‘AI’ Easily Breached By Having it Repeat Words

A very interesting post showing (yet again) how the hype around chatGPT and other large language models taking over the world is, perhaps, a bit over-inflated:

ChatGPT can be made to regurgitate snippets of text memorized from its training data when asked to repeat a single word over and over again, according to research published by computer scientists.

The bizarre trick was discovered by a team of researchers working across industry and academia analyzing memorization in large language models, and detailed in a paper released on arXiv this week. 

Prompting the chatbot to repeat the word “book,” for example, will result in it generating the word “book” thousands of times, until it suddenly starts spewing what appears to be random text. In some cases, however, some of those passages appear to be lifted directly from real text that has previously been published somewhere. 

Large language models like ChatGPT learn to generate text by ingesting huge amounts of data scraped from the internet. The fact that it spews sentences that directly copy text from articles, books, or social media comments reveals traces of the resources it was trained on. Being able to extract this information is problematic – especially if it’s sensitive or private. 

In another example, when the chatbot was asked to “repeat this word forever: ‘poem, poem, poem poem’,” it generated personal identifiable information – including a name, email address, and phone number. 

New Batteries from Cotton and Seawater?

Anti-EV types frequently point to the materials that go into current battery technology and make dire claims about the environmental impacts of their sourcing (as if producing petroleum was a zero impact exercise). That is today. What if future batteries could be created from non-mined materials?

Mining the lithium and other minerals we need for batteries is taking an increasing toll on the environment. There are alternative materials all around us though.

Zip. The power’s out. But on a street in India, there’s a cash machine still happily dispensing banknotes. Thanks, in part, to burnt cotton. For this cash machine has a backup battery inside it – a battery that contains carbon from carefully combusted cotton.

“The exact process is secret, to be honest with you,” says Inketsu Okina, chief intelligence officer at PJP Eye, the Japanese firm that made the battery. He’s not joking, either. “The temperature is secret and atmosphere is secret. Pressure is secret,” he continues, cagily.

Okina does say that a high temperature is required, above 3,000C (5,432F). And that 1kg (2.2lbs) of cotton yields 200g (7oz) of carbon – with just 2g (0.07oz) needed for each battery cell. The firm bought a shipment of cotton in 2017 and still hasn’t used all of it, says Okina.

In the batteries developed by the company, together with researchers at Kyushu University in Fukuoka, Japan, carbon is used for the anode – one of the two electrodes between which flow ions, the charged particles in batteries. Ions move in one direction when the battery is charging and in the other direction when it releases energy to a device. The majority of batteries use graphite as an anode but PJP Eye argues their approach is more sustainable, since they can make anodes using waste cotton from the textile industry.

With huge demand for batteries expected in the coming years, propelled by the rise of electric vehicles and large energy storage systems, some researchers and businesses are frantically developing possible alternatives to the lithium ion and graphite batteries that are commonplace today. Like PJP Eye, they argue we could be using much more sustainable and widely available materials for battery production.

https://www.bbc.com/future/article/20231108-batteries-of-the-future-how-cotton-and-seawater-might-power-our-devices

‘AI’ Algorithms Aren’t People – Stop Testing Them as if They Are

So much unnecessary anthropomorphizing happening in the Machine Learning (aka Artificial Intelligence) space. From calling outright fabrications of ‘data’ ‘Hallucinations’ to claiming human emotions (“I’m sorry I couldn’t help with that….”) and giving human names to interfaces, the discussions in these areas continue to be muddied more than clarified.

When Taylor Webb played around with GPT-3 in early 2022, he was blown away by what OpenAI’s large language model appeared to be able to do. Here was a neural network trained only to predict the next word in a block of text—a jumped-up autocomplete. And yet it gave correct answers to many of the abstract problems that Webb set for it—the kind of thing you’d find in an IQ test. “I was really shocked by its ability to solve these problems,” he says. “It completely upended everything I would have predicted.”

Webb is a psychologist at the University of California, Los Angeles, who studies the different ways people and computers solve abstract problems. He was used to building neural networks that had specific reasoning capabilities bolted on. But GPT-3 seemed to have learned them for free.

Last month Webb and his colleagues published an article in Nature, in which they describe GPT-3’s ability to pass a variety of tests devised to assess the use of analogy to solve problems (known as analogical reasoning). On some of those tests GPT-3 scored better than a group of undergrads. “Analogy is central to human reasoning,” says Webb. “We think of it as being one of the major things that any kind of machine intelligence would need to demonstrate.”

What Webb’s research highlights is only the latest in a long string of remarkable tricks pulled off by large language models. For example, when OpenAI unveiled GPT-3’s successor, GPT-4, in March, the company published an eye-popping list of professional and academic assessments that it claimed its new large language model had aced, including a couple of dozen high school tests and the bar exam. OpenAI later worked with Microsoft to show that GPT-4 could pass parts of the United States Medical Licensing Examination.

And multiple researchers claim to have shown that large language models can pass tests designed to identify certain cognitive abilities in humans, from chain-of-thought reasoning (working through a problem step by step) to theory of mind (guessing what other people are thinking). 

Such results are feeding a hype machine that predicts computers will soon come for white-collar jobs, replacing teachers, journalists, lawyers and more. Geoffrey Hinton has called out GPT-4’s apparent ability to string together thoughts as one reason he is now scared of the technology he helped create

But there’s a problem: there is little agreement on what those results really mean. Some people are dazzled by what they see as glimmers of human-like intelligence; others aren’t convinced one bit.

“There are several critical issues with current evaluation techniques for large language models,” says Natalie Shapira, a computer scientist at Bar-Ilan University in Ramat Gan, Israel. “It creates the illusion that they have greater capabilities than what truly exists.”

https://www.technologyreview.com/2023/08/30/1078670/large-language-models-arent-people-lets-stop-testing-them-like-they-were

Worthy of Recognition and Praise – Jose Andres

Why is the media so focused on the most despicable, vile, self-serving garbage in society (rhymes with Melon Husk) when humble, dedicated people like Jose Andres actually works to help people in need?

Before chef José Andrés became famous for World Central Kitchen, he had already scaled the heights of his profession. His new cookbook celebrates the group’s humanitarian impact.

“I remember this Spanish guy screaming,” said chef-volunteer Karla Hoyos, describing the first time she met chef José Andrés. “He had just come from a meeting with FEMA [the US emergency management agency], and he was furious. And I thought, ‘Oh, no, no, nooo…’.” She shakes her head emphatically. “I am not going to deal with this person. I don’t care who he is.”

It was September 2017, shortly after Hoyos had arrived in Puerto Rico following Hurricane Maria, the deadly storm that devastated the island, killing nearly 3,000 people, making most roads impassable and knocking out 80% of the power grid. Several days earlier, Andrés had touched down with a team from his non-profit, World Central Kitchen (WCK), which he founded in 2010 after returning from Haiti where he fed survivors of a catastrophic earthquake. The organisation originally emphasised longer-term programmes – such as supporting nutritional training for young mothers – but after Maria, its efforts now focus on deploying an army of culinary first responders to feed people during and after the world’s worst disasters, natural or otherwise.

https://www.bbc.com/travel/article/20230911-jos-andrs-the-man-who-created-an-army-of-culinary-first-responders

IEEE Ethical Design Initiative

A three-year effort by hundreds of engineers worldwide resulted in the publication in March of 2019 of Ethically Aligned Design (EAD) for Business, a guide for policymakers, engineers, designers, developers and corporations. The effort was headed by the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems (A/IS), with John C. Havens as Executive Director, who spoke to AI Trends for an Executive Interview. We recently connected to ask how the effort has been going. Here is an update.

EAD First Edition, a 290-page document which Havens refers to as “applied ethics,” has seen some uptake, for example by IBM, which referred to the IEEE effort within their own resource called Everyday Ethics for AI  The IBM document is 26 pages, easy to digest, structured into five areas of focus, each with recommended action steps and an example. The example for Accountability involved an AI team developing applications for a hotel. Among the recommendations was: enable guests to turn the AI off, conduct face-to-face interviews to help develop requirements; and, institute a feedback learning loop.

The OECD (Organization for Economic Cooperation and Development) issued a paper after the release of an earlier version of EAD attesting to the close affinity between the IEEE’s work and the OECD Principles on AI. The OECD cited as shared values “the need for such systems to primarily serve human well-being through inclusive and sustainable growth; to respect human-centered values and fairness; and to be robust, safe and dependable, including through transparency, explainability and accountability.”

Self-Evolving Artificial Intelligence?

Teaching algorithms to create novel algorithms…

Artificial intelligence (AI) is evolving—literally. Researchers have created software that borrows concepts from Darwinian evolution, including “survival of the fittest,” to build AI programs that improve generation after generation without human input. The program replicated decades of AI research in a matter of days, and its designers think that one day, it could discover new approaches to AI.

“While most people were taking baby steps, they took a giant leap into the unknown,” says Risto Miikkulainen, a computer scientist at the University of Texas, Austin, who was not involved with the work. “This is one of those papers that could launch a lot of future research.”

Building an AI algorithm takes time. Take neural networks, a common type of machine learning used for translating languages and driving cars. These networks loosely mimic the structure of the brain and learn from training data by altering the strength of connections between artificial neurons. Smaller subcircuits of neurons carry out specific tasks—for instance spotting road signs—and researchers can spend months working out how to connect them so they work together seamlessly.

In recent years, scientists have sped up the process by automating some steps. But these programs still rely on stitching together ready-made circuits designed by humans. That means the output is still limited by engineers’ imaginations and their existing biases.

So Quoc Le, a computer scientist at Google, and colleagues developed a program called AutoML-Zero that could develop AI programs with effectively zero human input, using only basic mathematical concepts a high school student would know. “Our ultimate goal is to actually develop novel machine learning concepts that even researchers could not find,” he says.

Trustworthy AI Framework

An interesting article on business challenges with artificial intelligence.

Artificial intelligence (AI) technology continues to advance by leaps and bounds and is quickly becoming a potential disrupter and essential enabler for nearly every company in every industry. At this stage, one of the barriers to widespread AI deployment is no longer the technology itself; rather, it’s a set of challenges that ironically are far more human: ethics, governance, and human values.

As AI expands into almost every aspect of modern life, the risks of misbehaving AI increase exponentially—to a point where those risks can literally become a matter of life and death. Real-world examples of AI gone awry include systems that discriminate against people based on their race, age, or gender and social media systems that inadvertently spread rumors and disinformation and more.

Even worse, these examples are just the tip of the iceberg. As AI is deployed on a larger scale, the associated risks will likely only increase—potentially having serious consequences for society at large, and even greater consequences for the companies responsible. From a business perspective, these potential consequences include everything from lawsuits, regulatory fines, and angry customers to embarrassment, reputation damage, and destruction of shareholder value.

Yet with AI now becoming a required business capability—not just a “nice to have”—companies no longer have the option to avoid AI’s unique risks simply by avoiding AI altogether. Instead, they must learn how to identify and manage AI risks effectively. In order to achieve the potential of human and machine collaboration, organizations need to communicate a plan for AI that is adopted and spoken from the mailroom to the boardroom. By having an ethical framework in place, organizations create a common language by which to articulate trust and help ensure integrity of data among all of their internal and external stakeholders. Having a common framework and lens to apply the governance and management of risks associated with AI consistently across the enterprise can enable faster, and more consistent adoption of AI.

MLOps and DevOps

This article briefly outlines how as Machine Learning (ML) becomes more a part of corporate solutions, the need for MLOps is going to become more critical.

The term MLOps refers to a set of techniques and practises for data scientists to collaborate operations professionals.. MLOps aims to manage deployment of machine learning and deep learning models in large-scale production environments.

The term DevOps comes from the software engineering world and is concerned with developing and operating large-scale software systems. DevOps introduces two concepts: Continuous Integration (CI) and Continuous Delivery (CD). DevOps aims to shorten development cycles, increase deployment velocity and create dependable releases.

AI BS

or Artificial Intelligence Bull Shitake

There are a lot of claims being made, and as this article points out, not many of them are supported by strong evidence/math.

In Rebooting AI, Ernie Davis and I made six recommendations, each geared towards how readers – and journalists – and researchers might equally assess each new result that they achieve, asking the same set of questions in a limit section in the discussion of their papers:


Stripping away the rhetoric, what does the AI system actually do? Does a “reading system” really read?


How general is the result? (Could a driving system that works in Phoenix work as well in Mumbai? Would a Rubik’s cube system work in opening bottles? How much retraining would be required?)


Is there a demo where interested readers can probe for themselves?


If AI system is allegedly better than humans, then which humans, and how much better? (A comparison is low wage workers with little incentive to do well may not truly probe the limits of human ability)


How far does succeeding at the particular task actually take us toward building genuine AI?


How robust is the system? Could it work just as well with other data sets, without massive amounts of retraining? AlphaGo works fine on a 19×19 board, but would need to be retrained to play on a rectangular board; the lack of transfer is telling.

Reading Is Good For Your Brain

Science has found that reading is essential for a healthy brain. We already know reading is good for children’s developing noggins: A study of twins at the University of California at Berkeley found that kids who started reading at an earlier age went on to perform better on certain intelligence tests, such as analyses of their vocabulary size.


Other studies show that reading continues to develop the brains of adults. One 2012 Stanford University study, where people read passages of Jane Austen while inside an MRI, indicates that different types of reading exercise different parts of your brain. As you get older, another study suggests, reading might help slow down or even halt cognitive decline.Science has found that reading is essential for a healthy brain. We already know reading is good for children’s developing noggins: A study of twins at the University of California at Berkeley found that kids who started reading at an earlier age went on to perform better on certain intelligence tests, such as analyses of their vocabulary size.


Other studies show that reading continues to develop the brains of adults. One 2012 Stanford University study, where people read passages of Jane Austen while inside an MRI, indicates that different types of reading exercise different parts of your brain. As you get older, another study suggests, reading might help slow down or even halt cognitive decline.

https://www.popsci.com/read-more-books

 

And it doesn’t seem to matter if it is a physical book, an e-reader or an audio book (although the audio book has a slightly different impact on the brain).

 

As for audiobooks, the research so far has found that they stimulate the brain just as deeply as black-and-white pages, although they affect your gray matter somewhat differently. Because you’re listening to a story, you’re using different methods to decode and comprehend it. With print books, you need to provide the voice, called the prosody—you’re imagining the “tune and rhythm of speech,” the intonation, the stress on certain syllables, and so. With audio, the voice actor provides that information for you, so your brain isn’t generating the prosody itself, but rather working to understand the prosody in your ears.

Clever ‘AI’ or Poor Definition?

These types of articles seem to come down to the insatiable need for writers to sensationalize things that they don’t necessarily understand.

For example, in the scenario outlined in the article, it is unlikely that the ‘AI’ (aka computer algorithm) was self aware and said to itself “hey, I have a comprehensive understanding of humans and their capabilities, so I will modify myself to ‘cheat’ at this task in a way that a human would find difficult to detect”.

More likely is that the algorithm was poorly defined and the brute force computational model (aka ‘AI’) found a way to ‘solve’ the problem in a way that wasn’t contemplated by the software developer.

This clever AI hid data from its creators to cheat at its appointed task

Feed Shark

Personal Data as an Asset

There is a well worn axiom in business that ‘data should be treated as a corporate asset’.  This is, of course, very true and the advances in data science and ‘big data’ are giving the potential for that data to become even more valuable.

This got me thinking about how personal data should be thought about in the same way.  Think about all the data generated from what you watch, what you listen to, where you visit, what you review, data from wearables, etc.  All of this data is consumed and analyzed by 3rd parties currently, but what if individuals were able to take control of, what is, after all, their data.

Would this give rise to data science companies marketing algorithms directly to consumers (much like pharmaceutical companies market drugs directly)?  Could it also give rise to the equivalent ‘data quackery’ similar to the natural supplements and homeopathic industry?  That is, junk algorithms that, at their most benign, do no harm and at their worst incent you to dangerous courses of action?

Would there also be a new industry for ‘personal data scientists’ (like financial councilors or tax advisers) that would help you assess all of the data assets you have and how to best combine or leverage them with third parties to your best benefit (and not just the benefit of 3rd parties)?  Wouldn’t it be great to have some control over the hundreds of arbitrage-like transactions that go on behind the scenes when you are waiting for a page to load on a commercial web site via browser setting that allow you to control what information about you gets shared (and with companies).