No need for more scare stories about the looming automation of the future. Artists, designers, photographers, authors, actors and musicians see little humour left in jokes about AI programs that will one day do their job for less money. That dark dawn is here, they say.
Vast amounts of imaginative output, work made by people in the kind of jobs once assumed to be protected from the threat of technology, have already been captured from the web, to be adapted, merged and anonymised by algorithms for commercial use. But just as GPT-4, the enhanced version of the AI generative text engine, was proudly unveiled last week, artists, writers and regulators have started to fight back in earnest.
“Picture libraries are being scraped for content and huge datasets being amassed right now,” says Isabelle Doran, head of the Association of Photographers. “So if we want to ensure the appreciation of human creativity, we need new ways of tracing content and the protection of smarter laws.”
Collective campaigns, lawsuits, international rules and IT hacks are all being deployed at speed on behalf of the creative industries in an effort, if not to win the battle, at least to “rage, rage against the dying of the light”, in the words of Welsh poet Dylan Thomas.
Poetry may still be a hard nut for AI to crack convincingly, but among the first to face a genuine threat to their livelihoods are photographers and designers. Generative software can produce images at the touch of a button, while sites like the popular NightCafe make “original”, data-derived artwork in response to a few simple verbal prompts. The first line of defence is a growing movement of visual artists and image agencies who are now “opting out” of allowing their work to be farmed by AI software, a process called “data training”. Thousands have posted “Do Not AI” signs on their social media accounts and web galleries as a result.
A software-generated approximation of Nick Cave’s lyrics notably drew the performer’s wrath earlier this year. He called it “a grotesque mockery of what it is to be human”. Not a great review. Meanwhile, AI innovations such as Jukebox are also threatening musicians and composers.
And digital voice-cloning technology is putting real narrators and actors out of regular work. In February, Gary Furlong, a veteran audiobook narrator from Texas, noticed that one of his contracts granted Apple the right to “use audiobook files for machine learning training and models”. The union SAG-AFTRA took up his case. The agency involved, Findaway Voices, now owned by Spotify, has since agreed to call a temporary halt and points to a “revoke” clause in its contracts. But this year Apple brought out its first books narrated by algorithms, a service Google has been offering for two years.
The creeping inevitability of this fresh challenge to artists seems unfair, even to spectators. As the award-winning British author Susie Alegre, a recent victim of AI plagiarism, asks: “Do we really need to find other ways to do things that people enjoy doing anyway? Things that give us a sense of achievement, like writing a poem? Why not replace the things that we don’t enjoy doing?”
Alegre, a human rights lawyer and writer based in London, argues that the value of authentic thinking has already been undermined: “If the world is going to put its faith in AI, what’s the point? Pay rates for original work have been massively diminished. This is automated intellectual asset-stripping.”
The truth is that AI incursions into the creative world are just the headline-grabbers. It is fun, after all, to read about a song or an award-winning piece of art dreamed up by computer. Accounts of software innovation in the field of insurance underwriting are less compelling. All the same, scientific efforts to simulate the imagination have always been at the forefront of the push for better AI, precisely because it is so difficult to do. Could software really produce paintings that entrance or stories that engage? So far the answer to both, happily, is “no”. Tone and appropriate emotional register remain hard to fake.
Yet the prospect of valid creative careers is at stake. ChatGPT is just one of the latest AI products, alongside Google’s Bard and Microsoft’s Bing, to have shaken up copyright law. Artists and writers who are losing out to AI tend to talk sorrowfully of programs that “spew rubbish” and “spout out nonsense”, and of a sense of “violation”. This moment of creative jeopardy has arrived not through any malevolent push, but because of the huge amount of data now available on the web for covert harvesting. But its victims are alarmed.
Analysis of the burgeoning problem in February found that the work of designers and illustrators is most vulnerable. Software programs such as Midjourney, Stable Diffusion and DALL-E 2 are creating images in seconds, all culled from a databank of styles and colour palettes. One platform, ArtStation, was reportedly so overwhelmed by anti-AI memes that it requested the labelling of AI artwork.
At the Association of Photographers, Doran has mounted a survey to gauge the scale of the attack. “We have clear evidence that image datasets, which form the basis of these commercial AI generative image content programs, consist of millions of images from public-facing websites taken without permission or payment,” she says. Using the site Have I Been Trained, which has access to the Stable Diffusion dataset, her “shocked” members have identified their own images and are mourning the diminished worth of their intellectual property.
The opt-out movement is spreading, with tens of millions of artworks and images excluded in the last few weeks. But following the trail is tricky as images are used by clients in altered forms and opt-out clauses can be hard to find. Many photographers are also reporting that their “style” is being mimicked to produce cheaper work. “As these programs are devised to ‘machine learn’, at what point can they generate with ease the style of an established professional photographer and displace the need for their human creativity?” says Doran.
For Alegre, who last month discovered paragraphs of her prize-winning book Freedom to Think were being offered up, uncredited, by ChatGPT, there are hidden dangers to simply opting out: “It means you are completely written out of the story, and for a woman that is problematic.”
Alegre’s work is already being misattributed to male authors by AI, so removing it from the equation would compound the error. Databanks can only reflect what they have access to.
“ChatGPT said I did not exist, although it quoted my work. Apart from the damage to my ego, I do exist on the internet, so it felt like a violation,” she says.
“Later it came up with a pretty accurate synopsis of my book, but said the author was some random bloke. And, funnily enough, my book is about the way misinformation twists our worldview. AI content really is about as reliable as checking your horoscope.” She would like to see AI development funding diverted to the search for new legal protections.
Fans of AI may well promise it can help us to better understand the future beyond our intellectual limitations. But for plagiarised artists and writers, it now seems the best hope is that it will teach humans yet again that we should doubt and check everything we see and read.
Typo blamed for Microsoft Azure DevOps outage in Brazil
Microsoft Azure DevOps, a suite of application lifecycle services, stopped working in the South Brazil region for about ten hours on Wednesday due to a basic code error.
On Friday Eric Mattingly, principal software engineering manager, offered an apology for the disruption and revealed the cause of the outage: a simple typo that deleted seventeen production databases.
Mattingly explained that Azure DevOps engineers occasionally take snapshots of production databases to look into reported problems or test performance improvements. And they rely on a background system that runs daily and deletes old snapshots after a set period of time.
During a recent sprint – a group project in Agile jargon – Azure DevOps engineers performed a code upgrade, replacing deprecated Microsoft.Azure.Management.* packages with supported Azure.ResourceManager.* NuGet packages.
The result was a large pull request – a set of code changes that must be reviewed and merged into the applicable project – swapping API calls in the old packages for those in the newer ones. The typo lurked in that pull request, and it led the background snapshot deletion job to delete an entire server rather than a single database.
“Hidden within this pull request was a typo bug in the snapshot deletion job which swapped out a call to delete the Azure SQL Database to one that deletes the Azure SQL Server that hosts the database,” said Mattingly.
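Mattingly’s post does not include the offending code, but the shape of the bug is easy to sketch. Below is a rough Python analogue using the azure-mgmt-sql SDK – the real change was in C#/.NET, and the resource names and cleanup loop here are invented for illustration; only the two SDK delete calls are real:

```python
# Illustrative sketch of the class of bug described above. All names are
# hypothetical; only the azure-mgmt-sql SDK calls themselves are real.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient

client = SqlManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Stand-in for the snapshot databases the daily job found past retention.
expired = [("devops-rg", "brazil-sql-server", "snapshot-db-old")]

for resource_group, server, database in expired:
    # Intended call: delete one expired snapshot database.
    # client.databases.begin_delete(resource_group, server, database).wait()
    # The buggy call: delete the SQL Server itself, and every database on it.
    client.servers.begin_delete(resource_group, server).wait()
```

The two delete methods sit close together in the SDK and take overlapping arguments, which is how a one-line mix-up can slip through review in a large pull request.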
Azure DevOps has tests to catch such issues, but according to Mattingly, the errant code only runs under certain conditions and thus isn’t well covered under existing tests. Those conditions, presumably, require the presence of a database snapshot that is old enough to be caught by the deletion script.
Mattingly said Sprint 222 was deployed internally (Ring 0) without incident due to the absence of any snapshot databases. Several days later, the software changes were deployed to the customer environment (Ring 1) for the South Brazil scale unit (a cluster of servers for a specific role). That environment had a snapshot database old enough to trigger the bug, which led the background job to delete the “entire Azure SQL Server and all seventeen production databases” for the scale unit.
The data has all been recovered, but it took more than ten hours. There are several reasons for that, said Mattingly.
One is that since customers can’t revive Azure SQL Servers themselves, on-call Azure engineers had to handle the restores, a process that took about an hour for many of the databases.
Another reason is that the databases had different backup configurations: some were configured for Zone-redundant backup and others were set up for the more recent Geo-zone-redundant backup. Reconciling this mismatch added many hours to the recovery process.
“Finally,” said Mattingly, “even after databases began coming back online, the entire scale unit remained inaccessible even to customers whose data was in those databases due to a complex set of issues with our web servers.”
These issues arose from a server warmup task that iterated through the list of available databases with a test call. Databases in the process of being recovered threw an error that led the warm-up test “to perform an exponential backoff retry resulting in warmup taking ninety minutes on average, versus sub-second in a normal situation.”
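Exponential backoff doubles the wait between retries, which is how a sub-second warmup stretched to ninety minutes once every test call was failing. A minimal sketch of the pattern – the function and parameter names here are assumptions, not Azure DevOps code:

```python
import time

def wait_until_ready(probe, max_attempts=10, base_delay=1.0):
    """Retry a test call with exponential backoff. Illustrative sketch only."""
    delay = base_delay
    for attempt in range(max_attempts):
        try:
            probe()            # e.g. a test query against one database
            return True        # database answered; warmup can move on
        except ConnectionError:
            time.sleep(delay)  # wait 1s, 2s, 4s, 8s ... between attempts
            delay *= 2         # the doubling is what makes failures so costly
    return False
```

Against a healthy database the first probe returns immediately; against a recovering one, the sleeps alone add up to roughly 17 minutes over ten attempts.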
Further complicating matters, the recovery process was staggered: once one or two of the servers started taking customer traffic again, they would get overloaded and go down. Ultimately, restoring service required blocking all traffic to the South Brazil scale unit until everything was sufficiently ready to rejoin the load balancer and handle traffic.
Various fixes and reconfigurations have been put in place to prevent the issue from recurring.
“Once again, we apologize to all the customers impacted by this outage,” said Mattingly. ®
What are the current trends in Ireland’s pharma sector?
SiliconRepublic.com took a look at PDA Ireland’s Visual Inspection event to learn about Ireland’s pharma sector and its biggest strengths.
Ireland’s pharmaceutical stakeholders gathered in Cork recently to learn the latest developments and regulatory changes in the sector.
The event was hosted by the Irish chapter of the Parenteral Drug Association (PDA), a non-profit trade group that shares science, technology and regulatory information with pharma and biopharma companies.
The association held a Visual Inspection event in Cork last month, where speakers shared their outlooks on the industry, the regulatory landscape and tips on product investigation.
PDA Ireland committee member Deidre Tobin told SiliconRepublic.com that one goal of the event was to bring the industry together and help SMEs engage with top speakers.
“The mission of PDA is really to bring people together in industry and to have that network sharing, that information gathering so that we’re all consistent, we all have the same message,” Tobin said.
Ireland’s advantages
Ireland has grown to become a hub of leading pharma companies over the years, with many multinational companies setting up sites here. By 2017, 24 of the world’s top biotech and pharma companies had made a home for themselves in Ireland.
The sector also remains active in mergers and acquisitions. A William Fry report claimed pharma accounted for 12pc of all Irish M&A deals by volume in 2022.
Ruaidhrí O’Brien, head of UK and Ireland sales at Körber Pharma and a PDA Ireland member, said the country has a “wealth of experience” across various types of pharmaceutical production, such as API bulk and solid dosage production.
O’Brien claimed there’s also been growth in the “liquid fill finish area”, which relates to completed pharma products such as vaccines. During the Covid-19 pandemic, Pfizer confirmed its Irish operations were being used to manufacture its vaccine.
O’Brien also said Ireland has “skilled people” at senior levels within companies, which he feels is why existing companies continue to invest and why “we have amazing investments from all the global leaders”.
Regulatory changes
One speaker at the PDA Ireland Visual Inspection event was John Shabushnig, the founder of Insight Pharma Consulting LLC. He spoke about current and upcoming regulation impacting the global sector.
Shabushnig said he sees the overall industry understanding of what it can and can’t do “continuing to improve”. He also said there is better alignment between regulators and industry now “than I saw 10 or 20 years ago”.
Shabushnig spoke positively about the regulatory landscape overall and couldn’t think of any “big misses” in terms of industry ignoring regulation. But he did note that some developing areas in the industry are “a bit unknown”.
“Advanced therapies, cell and gene therapies, there are some unique challenges on inspecting those products that we’re kind of learning together at this point,” Shabushnig said.
But Shabushnig said there are also “big opportunities” ahead with new tools that can be taken advantage of. One example he gave was using AI for automated visual inspection, which Shabushnig described as a “very exciting tool”.
Prof Saurabh Bagchi from Purdue University explains the purpose of AI black boxes and why researchers are moving towards ‘explainable AI’.
For some people, the term ‘black box’ brings to mind the recording devices in airplanes that are valuable for postmortem analyses if the unthinkable happens. For others, it evokes small, minimally outfitted theatres. But ‘black box’ is also an important term in the world of artificial intelligence.
AI black boxes refer to AI systems with internal workings that are invisible to the user. You can feed them input and get output, but you cannot examine the system’s code or the logic that produced the output.
Machine learning is the dominant subset of artificial intelligence. It underlies generative AI systems like ChatGPT and DALL-E 2. There are three components to machine learning: an algorithm or a set of algorithms, training data and a model.
An algorithm is a set of procedures. In machine learning, an algorithm learns to identify patterns after being trained on a large set of examples – the training data. Once a machine-learning algorithm has been trained, the result is a machine-learning model. The model is what people use.
For example, a machine-learning algorithm could be designed to identify patterns in images and the training data could be images of dogs. The resulting machine-learning model would be a dog spotter. You would feed it an image as input and get as output whether and where in the image a set of pixels represents a dog.
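A toy sketch makes the three-way split concrete: scikit-learn’s logistic regression is the algorithm, synthetic pixel arrays stand in for the training data, and the fitted object that comes back is the model. Everything below is invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
images = rng.random((200, 64))     # training data: 200 fake 8x8 images
labels = rng.integers(0, 2, 200)   # stand-in labels: 1 = dog, 0 = not

algorithm = LogisticRegression(max_iter=1000)  # the algorithm
model = algorithm.fit(images, labels)          # training produces the model

# The model is what people use: image in, prediction out.
print(model.predict(images[:1]))   # e.g. [1], meaning "dog"
```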
Any of the three components of a machine-learning system can be hidden, or in a black box. Often, the algorithm is publicly known, which makes hiding it less effective. So, to protect their intellectual property, AI developers often put the model in a black box. Another approach software developers take is to obscure the data used to train the model – in other words, put the training data in a black box.
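In practice, putting the model in a black box often just means serving it behind an API, so callers see inputs and outputs but never the weights, code or training data. A minimal sketch using Flask – the endpoint and payload shape are made up for illustration:

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# Throwaway stand-in model, trained on synthetic data as in the sketch above.
rng = np.random.default_rng(0)
model = LogisticRegression(max_iter=1000).fit(
    rng.random((200, 64)), rng.integers(0, 2, 200))

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    pixels = request.get_json()["pixels"]    # caller sends 64 pixel values
    label = int(model.predict([pixels])[0])  # internals never leave the server
    return jsonify({"dog": bool(label)})     # only the answer comes back
```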
The opposite of a black box is sometimes referred to as a glass box. An AI glass box is a system whose algorithms, training data and model are all available for anyone to see. But researchers sometimes characterise aspects of even these as black box.
That’s because researchers don’t fully understand how machine-learning algorithms, particularly deep-learning algorithms, operate. The field of explainable AI is working to develop algorithms that, while not necessarily glass box, can be better understood by humans.
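One widely used explainable-AI probe is permutation importance: shuffle one input feature at a time and measure how much the model’s accuracy drops, revealing what an otherwise opaque model actually relies on. A sketch with scikit-learn, on synthetic data where only one feature genuinely matters:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = rng.random((300, 5))
y = (X[:, 2] > 0.5).astype(int)    # the hidden rule: only feature 2 counts

model = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)     # feature 2's score should dwarf the rest
```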
Thinking Outside The Black Box
In many cases, there is good reason to be wary of black box machine-learning algorithms and models. Suppose a machine-learning model has made a diagnosis about your health. Would you want the model to be black box or glass box? What about the physician prescribing your course of treatment? Perhaps she would like to know how the model arrived at its decision.
What if a machine-learning model that determines whether you qualify for a business loan from a bank turns you down? Wouldn’t you like to know why? If you did, you could more effectively appeal the decision, or change your situation to increase your chances of getting a loan the next time.
Black boxes also have important implications for software system security. For years, many people in the computing field thought that keeping software in a black box would prevent hackers from examining it and therefore it would be secure. This assumption has largely been proven wrong because hackers can reverse engineer software – that is, build a facsimile by closely observing how a piece of software works – and discover vulnerabilities to exploit.
If software is in a glass box, software testers and well-intentioned hackers can examine it and inform the creators of weaknesses, thereby minimising cyberattacks.
Saurabh Bagchi is professor of electrical and computer engineering and director of corporate partnerships in the School of Electrical and Computer Engineering at Purdue University in the US. His research interests include dependable computing and distributed systems.