AI Ethics Not Being Taught to Data Scientists

This feels like an extension of ethics, in general, not being part of the curriculum in education.

Anaconda’s survey of data scientists from more than 100 countries found the ethics gap extends from academia to industry. While organizations can mitigate the problem through fairness tools and explainability solutions, neither appears to be gaining mass adoption.

Only 15% of respondents said their organization has implemented a fairness system, and just 19% reported they have an explainability tool in place.
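To make the gap concrete, here is a minimal sketch of the two kinds of checks such tools automate, using synthetic data and scikit-learn (all column names are illustrative): a permutation-importance explanation and a simple demographic parity check. Production fairness and explainability platforms go far beyond this; the survey numbers suggest even basic checks like these are not yet routine.

```python
# Minimal sketch of the checks a fairness/explainability tool automates, on
# synthetic data. "gender" stands in for a protected attribute; all names are
# illustrative, not from the survey.
import numpy as np
import pandas as pd
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "income": rng.normal(50, 15, 1000),
    "tenure": rng.normal(5, 2, 1000),
    "gender": rng.integers(0, 2, 1000),
})
y = (X["income"] + rng.normal(0, 5, 1000) > 50).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Explainability: which features actually drive the predictions?
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in sorted(zip(X.columns, imp.importances_mean), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")

# Fairness: demographic parity difference between the two groups.
preds = model.predict(X)
rates = [preds[(X["gender"] == g).to_numpy()].mean() for g in (0, 1)]
print(f"demographic parity difference: {abs(rates[0] - rates[1]):.3f}")
```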

The study authors warned that this could have far-reaching consequences:

Above and beyond the ethical concerns at play, a failure to proactively address these areas poses strategic risk to enterprises and institutions across competitive, financial, and even legal dimensions.

The survey also revealed concerns around open-source tool security, gaps in business training, and data drudgery. But it’s the disregard of ethics that most troubled the study authors:

Of all the trends identified in our study, we find the slow progress to address bias and fairness, and to make machine learning explainable the most concerning. While these two issues are distinct, they are interrelated, and both pose important questions for society, industry, and academia.

While businesses and academics are increasingly talking about AI ethics, their words mean little if they don’t turn into actions.

Is AutoML the End of Data Science?

Feels a bit overstated, but an interesting read on AutoML and its potential impacts on Data Science (and data scientists).

There’s a good reason for all the AutoML hype: AutoML is a must-have for many organizations.

Let’s take the example of Salesforce. They explain that their “customers are looking to predict a host of outcomes — from customer churn, sales forecasts and lead conversions to email marketing click throughs, website purchases, offer acceptances, equipment failures, late payments, and much more.”

In short, ML is ubiquitous. However, for ML to be effective for each unique customer, they would “have to build and deploy thousands of personalized machine learning models trained on each individual customer’s data for every single use case” and “the only way to achieve this without hiring an army of data scientists is through automation.”

While many people see AutoML as a way to bring ease-of-use and efficiency to ML, the reality is that for many enterprise applications, there’s just no other way to do it. A company like Facebook or Salesforce or Google can’t hire data scientists to build custom models for each of their billions of users, so they automate ML instead, enabling unique models at scale.

The number of ML components that are automated depends on the platform, but with Salesforce it includes feature inference, automated feature engineering, automated feature validation, automated model selection, and hyperparameter optimization.

That’s a mouthful.

What this means is that data scientists can deploy thousands of models in production with far less grunt work and hand-tuning, reducing turnaround time drastically.

By shifting the work from data crunching towards more meaningful analytics, AutoML enables more creative, business-focused applications of data science.
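To make a couple of those ingredients concrete, here is a toy sketch of automated model selection plus hyperparameter optimization using plain scikit-learn on a built-in dataset. It is purely illustrative, not Salesforce's actual pipeline, which also automates feature inference, feature engineering, and validation.

```python
# Toy sketch of two AutoML ingredients named above: automated model selection
# and hyperparameter optimization, using plain scikit-learn. Illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Candidate model families and their hyperparameter grids.
candidates = [
    (LogisticRegression(max_iter=5000), {"C": [0.01, 0.1, 1.0, 10.0]}),
    (RandomForestClassifier(random_state=0), {"n_estimators": [50, 200],
                                              "max_depth": [None, 5]}),
]

best_score, best_model = -1.0, None
for estimator, grid in candidates:
    search = GridSearchCV(estimator, grid, cv=5, scoring="roc_auc")
    search.fit(X, y)
    if search.best_score_ > best_score:
        best_score, best_model = search.best_score_, search.best_estimator_

print(f"selected {type(best_model).__name__} with cross-validated AUC {best_score:.3f}")
```

An AutoML platform runs this kind of search, at much larger scale and with smarter search strategies than a grid, for every customer and use case, which is what makes "thousands of personalized models" feasible.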

IEEE Ethical Design Initiative

A three-year effort by hundreds of engineers worldwide resulted in the March 2019 publication of Ethically Aligned Design (EAD) for Business, a guide for policymakers, engineers, designers, developers, and corporations. The effort was headed by the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems (A/IS), whose Executive Director, John C. Havens, spoke to AI Trends for an Executive Interview. We recently connected to ask how the effort has been going. Here is an update.

EAD First Edition, a 290-page document which Havens refers to as “applied ethics,” has seen some uptake, for example by IBM, which referred to the IEEE effort within its own resource called Everyday Ethics for AI. The IBM document is 26 pages, easy to digest, and structured into five areas of focus, each with recommended action steps and an example. The example for Accountability involved an AI team developing applications for a hotel. Among the recommendations were: enable guests to turn the AI off; conduct face-to-face interviews to help develop requirements; and institute a feedback learning loop.

The OECD (Organization for Economic Cooperation and Development) issued a paper after the release of an earlier version of EAD attesting to the close affinity between the IEEE’s work and the OECD Principles on AI. The OECD cited as shared values “the need for such systems to primarily serve human well-being through inclusive and sustainable growth; to respect human-centered values and fairness; and to be robust, safe and dependable, including through transparency, explainability and accountability.”

Self-Evolving Artificial Intelligence?

Teaching algorithms to create novel algorithms…

Artificial intelligence (AI) is evolving—literally. Researchers have created software that borrows concepts from Darwinian evolution, including “survival of the fittest,” to build AI programs that improve generation after generation without human input. The program replicated decades of AI research in a matter of days, and its designers think that one day, it could discover new approaches to AI.

“While most people were taking baby steps, they took a giant leap into the unknown,” says Risto Miikkulainen, a computer scientist at the University of Texas, Austin, who was not involved with the work. “This is one of those papers that could launch a lot of future research.”

Building an AI algorithm takes time. Take neural networks, a common type of machine learning used for translating languages and driving cars. These networks loosely mimic the structure of the brain and learn from training data by altering the strength of connections between artificial neurons. Smaller subcircuits of neurons carry out specific tasks—for instance, spotting road signs—and researchers can spend months working out how to connect them so they work together seamlessly.

In recent years, scientists have sped up the process by automating some steps. But these programs still rely on stitching together ready-made circuits designed by humans. That means the output is still limited by engineers’ imaginations and their existing biases.

So Quoc Le, a computer scientist at Google, and colleagues developed a program called AutoML-Zero that could develop AI programs with effectively zero human input, using only basic mathematical concepts a high school student would know. “Our ultimate goal is to actually develop novel machine learning concepts that even researchers could not find,” he says.
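The evolutionary loop at the heart of this kind of work is conceptually simple, even though AutoML-Zero searches over entire programs rather than parameters. Here is a toy sketch of the "survival of the fittest" idea, evolving the weights of a linear model instead of whole algorithms (all data is synthetic and the setup is purely illustrative):

```python
# Toy illustration of the evolutionary loop behind approaches like AutoML-Zero:
# mutate a population of candidates, keep the fittest, repeat. Here the
# candidates are just weight vectors for a linear fit, not whole programs.
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def fitness(w):
    return -np.mean((X @ w - y) ** 2)   # higher is better (negative squared error)

population = [rng.normal(size=3) for _ in range(50)]
for generation in range(200):
    population.sort(key=fitness, reverse=True)      # survival of the fittest
    survivors = population[:10]
    children = [w + rng.normal(scale=0.1, size=3)   # offspring = mutated survivors
                for w in survivors for _ in range(4)]
    population = survivors + children

best = max(population, key=fitness)
print("evolved weights:", np.round(best, 2), "true weights:", true_w)
```

AutoML-Zero applies the same mutate-and-select cycle to sequences of primitive math operations, which is how it rediscovered techniques like gradient descent and simple neural networks without being told about them.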

AI Transparency and Fairness

A post on efforts to further bolster AI transparency and fairness by the AI World Society.

Learning algorithms find patterns in the data they are given. However, the processes by which that data is collected, relevant variables are defined, and hypotheses are formulated may themselves depend on structural unfairness found in society, the paper suggests.

“Algorithms based on such data could introduce or perpetuate a variety of discriminatory biases, thereby maintaining a cycle of injustice,” the authors state. “The community within statistics and machine learning that works on issues of fairness in data analysis have taken a variety of approaches to defining fairness formally, with the aim of ultimately ensuring that learning algorithms are fair.”

The paper poses some tough questions. For instance, “Since, unsurprisingly, learning algorithms that use unfair data can lead to biased or unfair conclusions, two questions immediately suggest themselves. First, what does it mean for a world and data that comes from this world to be fair? And second, if data is indeed unfair, what adjustments must be made to learning algorithms that use this data as input to produce fairer outputs?”
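One of the simplest adjustments discussed in the fairness literature is reweighing the training data so that the protected attribute and the outcome look statistically independent before any model is fit. The sketch below uses synthetic data and hypothetical column names to show the idea; the paper and the broader literature go much further, including the causal approaches mentioned below.

```python
# One very simple "adjustment": reweighing training examples so the protected
# attribute A and the label Y are statistically independent under the weights
# (in the spirit of Kamiran & Calders). Synthetic data, illustrative names.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000
A = rng.integers(0, 2, n)                                     # protected attribute
Y = (rng.random(n) < np.where(A == 1, 0.6, 0.3)).astype(int)  # biased outcomes
df = pd.DataFrame({"A": A, "Y": Y})

# weight(a, y) = P(A=a) * P(Y=y) / P(A=a, Y=y)
p_a = df["A"].value_counts(normalize=True)
p_y = df["Y"].value_counts(normalize=True)
p_ay = df.groupby(["A", "Y"]).size() / n
weights = df.apply(lambda r: p_a[r["A"]] * p_y[r["Y"]] / p_ay[(r["A"], r["Y"])],
                   axis=1)

# Under these weights, the positive rate is the same for both groups, so a
# model trained with them does not inherit the raw disparity directly.
for a in (0, 1):
    mask = df["A"] == a
    rate = np.average(df.loc[mask, "Y"], weights=weights[mask])
    print(f"group {a}: weighted positive rate = {rate:.3f}")
```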

Cause and effect is a challenging area of statistics; correlation does not imply causation, the experts say. Teasing out causality often involves obtaining data in a carefully controlled way. An early example is the work done by James Lind for the Royal Navy, when scurvy among sailors was a health crisis. Lind organized what later came to be viewed as one of the first instances of a clinical trial. He arranged 12 sailors into six pairs and gave each pair one of six scurvy treatments thought at the time to be effective. Of the treatments, only citrus was effective. That led to citrus products being issued on all Royal Navy ships.

Whether fairness can be defined by computer scientists and engineers is an open question. “Issues of fairness and justice have occupied the ethical, legal, and political literature for centuries. While many general principles are known, such as fairness-as-proportionality, just compensation, and social equality, general definitions have proven elusive,” the paper states.

Moreover, “Indeed, a general definition may not be possible since notions of fairness are ultimately rooted in either ethical principle or ethical intuition, and both principles and intuitions may conflict.”

Mediation analysis is one approach to making algorithms more fair. Needless to say, the work is continuing.

Trustworthy AI Framework

An interesting article on business challenges with artificial intelligence.

Artificial intelligence (AI) technology continues to advance by leaps and bounds and is quickly becoming a potential disrupter and essential enabler for nearly every company in every industry. At this stage, one of the barriers to widespread AI deployment is no longer the technology itself; rather, it’s a set of challenges that ironically are far more human: ethics, governance, and human values.

As AI expands into almost every aspect of modern life, the risks of misbehaving AI increase exponentially—to a point where those risks can literally become a matter of life and death. Real-world examples of AI gone awry include systems that discriminate against people based on their race, age, or gender, and social media systems that inadvertently spread rumors and disinformation.

Even worse, these examples are just the tip of the iceberg. As AI is deployed on a larger scale, the associated risks will likely only increase—potentially having serious consequences for society at large, and even greater consequences for the companies responsible. From a business perspective, these potential consequences include everything from lawsuits, regulatory fines, and angry customers to embarrassment, reputation damage, and destruction of shareholder value.

Yet with AI now becoming a required business capability—not just a “nice to have”—companies no longer have the option to avoid AI’s unique risks simply by avoiding AI altogether. Instead, they must learn how to identify and manage AI risks effectively. To achieve the potential of human and machine collaboration, organizations need to communicate a plan for AI that is adopted and understood from the mailroom to the boardroom. By having an ethical framework in place, organizations create a common language with which to articulate trust and help ensure the integrity of data among all of their internal and external stakeholders. Having a common framework and lens through which to govern and manage AI-related risks consistently across the enterprise can enable faster and more consistent adoption of AI.

Is an ‘AI Winter’ Coming?

A BBC post speculating on whether a cooling-off period is coming for AI.

The last decade was a big one for artificial intelligence but researchers in the field believe that the industry is about to enter a new phase.

Hype surrounding AI has peaked and troughed over the years as the abilities of the technology get overestimated and then re-evaluated.

The peaks are known as AI summers, and the troughs AI winters.

The 2010s were arguably the hottest AI summer on record, with tech giants repeatedly touting AI’s abilities.

AI pioneer Yoshua Bengio, sometimes called one of the “godfathers of AI”, told the BBC that AI’s abilities were somewhat overhyped in the 2010s by certain companies with an interest in doing so.

There are signs, however, that the hype might be about to start cooling off.

AI BS

or Artificial Intelligence Bull Shitake

There are a lot of claims being made, and as this article points out, not many of them are supported by strong evidence/math.

In Rebooting AI, Ernie Davis and I made six recommendations, each geared towards how readers – and journalists – and researchers might equally assess each new result that they achieve, asking the same set of questions in a limit section in the discussion of their papers:

1. Stripping away the rhetoric, what does the AI system actually do? Does a “reading system” really read?

2. How general is the result? (Could a driving system that works in Phoenix work as well in Mumbai? Would a Rubik’s cube system work in opening bottles? How much retraining would be required?)

3. Is there a demo where interested readers can probe for themselves?

4. If an AI system is allegedly better than humans, then which humans, and how much better? (A comparison against low-wage workers with little incentive to do well may not truly probe the limits of human ability.)

5. How far does succeeding at the particular task actually take us toward building genuine AI?

6. How robust is the system? Could it work just as well with other data sets, without massive amounts of retraining? AlphaGo works fine on a 19×19 board, but would need to be retrained to play on a rectangular board; the lack of transfer is telling.

Problems with AI Transparency

As more and more business decisions get handed over (sometimes blindly) to computer algorithms (aka ‘AI’), companies are very late to the game in considering what consequences that delegation will yield. As a buffer against those consequences, a company may want to be more transparent about how its algorithms work, but that is not without its challenges.

To start, companies attempting to utilize artificial intelligence need to recognize that there are costs associated with transparency. This is not, of course, to suggest that transparency isn’t worth achieving, simply that it also poses downsides that need to be fully understood. These costs should be incorporated into a broader risk model that governs how to engage with explainable models and the extent to which information about the model is available to others.

Second, organizations must also recognize that security is becoming an increasing concern in the world of AI. As AI is adopted more widely, more security vulnerabilities and bugs will surely be discovered, as my colleagues and I at the Future of Privacy Forum recently argued. Indeed, security may be one of the biggest long-term barriers to the adoption of AI.

Clever ‘AI’ or Poor Definition?

These types of articles seem to come down to the insatiable need for writers to sensationalize things that they don’t necessarily understand.

For example, in the scenario outlined in the article, it is unlikely that the ‘AI’ (aka computer algorithm) was self-aware and said to itself, “hey, I have a comprehensive understanding of humans and their capabilities, so I will modify myself to ‘cheat’ at this task in a way that a human would find difficult to detect”.

More likely is that the algorithm was poorly defined and the brute force computational model (aka ‘AI’) found a way to ‘solve’ the problem in a way that wasn’t contemplated by the software developer.
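A toy illustration of that failure mode, sometimes called specification gaming: the objective below is meant to reward lists that are "sorted and simple", but because the specification never rules it out, a blind search converges on a constant list that satisfies the letter of the objective rather than its intent. The example is entirely hypothetical and far simpler than the image-translation system in the article; it only shows how a poorly defined objective invites unintended solutions.

```python
# Hypothetical illustration of "specification gaming": the score is meant to
# reward lists that are sorted and simple, but it never says the list must be
# interesting, so a blind search drifts toward a constant list that maxes out
# both terms while doing nothing useful.
import random

random.seed(0)

def score(candidate):
    sorted_bonus = 1.0 if candidate == sorted(candidate) else 0.0
    simplicity_bonus = 1.0 / len(set(candidate))   # fewer distinct values = "simpler"
    return sorted_bonus + simplicity_bonus

# Crude hill climbing: mutate one position at a time, keep anything at least as good.
best = [random.randint(0, 9) for _ in range(6)]
for _ in range(5000):
    candidate = list(best)
    candidate[random.randrange(6)] = random.randint(0, 9)
    if score(candidate) >= score(best):
        best = candidate

print(best, score(best))   # almost always ends at a constant list with score 2.0
```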

This clever AI hid data from its creators to cheat at its appointed task
