March 20, 2023
|
Armilla Review #4

By Dan Adamson, Karthik Ramakrishnan and Philip Dawson

The Armilla Review is a weekly digest of important news from the AI industry, the market, government and academia, tailored to the interests of our community with a focus on AI evaluation, assurance, and risk.

A March 2023 report published by the Geneva Association (GA), a trade association representing the world’s largest insurers, found that 75% of the executives it surveyed view AI, alongside only cloud computing, as the key digital liability risk. The report cites the risk of discrimination resulting from incomplete data sets, algorithmic bias, and defects and errors in AI systems, as well as generative AI, as significant sources of legal and reputational liability.


Interestingly, while the GA report urges insurers to get ready for AI liability regimes, particularly in light of emerging AI regulatory initiatives in the EU, it also highlights the need to prepare for the impacts of the metaverse, a phenomenon likely to rely on AI (in particular, multimodal generative AI such as GPT-4) to power its immersive digital experiences. In brief, AI liability, and AI insurance, should be top of mind for insurers and businesses for years to come.


Lineups of people being organized and guided by robots // Midjourney


In this newsletter, you’ll find:

  • Bing Chat and the Future of Responsible Search?
  • ‘They’ll all go to the US’: What the EU’s AI law means for European startups
  • Microsoft just laid off one of its responsible AI teams
  • How Artists and Writers are Fighting Back Against AI
  • Workplace AI Vendors, Employer Rush to Set Bias Auditing Bar
  • UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation
  • Stanford Alpaca: A Strong, Replicable Instruction Following Model
  • Generative AI, Explained
  • GPT-4 Technical Report
  • Google Adds GenAI to Workspace
  • Google Cloud gives developers access to its foundation models through PaLM API
  • Anthropic Releases Claude
  • PyTorch 2.0
  • Midjourney V5
  • Microsoft 365 Copilot

From the Armilla Blog

My major concern is from a responsible AI perspective. There are many articles and threads, ranging from funny to scary, about Bing Chat going off the rails. Most famously, New York Times tech reporter Kevin Roose had a long, unsettling exchange in which Bing Chat tried to convince him to leave his wife for his one true love (Bing Chat, of course), but there have been others as well.


Top Articles

Brussels wants to pass “world-leading” regulation on developing AI — but startups worry it’ll be more hindrance than help.

Microsoft laid off its entire ethics and society team within the artificial intelligence organization as part of recent layoffs that affected 10,000 employees across the company. Time will tell how the layoffs affect Microsoft’s ability to ensure its AI principles and Responsible AI standard are implemented across the company.

No need for more scare stories about the looming automation of the future. Artists, designers, photographers, authors, actors and musicians see little humour left in jokes about AI programs that will one day do their job for less money. That dark dawn is here, they say.

Vendors, auditors, employers, and regulators are each scrambling to establish standards for bias-proofing the AI tools that companies increasingly depend on to make employment decisions.

Universal Prompt Retrieval (UPRISE) is a new approach that tunes a lightweight and versatile retriever to automatically retrieve prompts for a given task input, improving zero-shot performance and helping mitigate hallucinations.
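The core retrieval-then-prompt idea can be sketched as follows. This is a minimal illustration only: UPRISE tunes a neural retriever, whereas here a simple bag-of-words cosine similarity stands in, and all function names are hypothetical.

```python
# Illustrative sketch of prompt retrieval for zero-shot prompting:
# score a pool of candidate demonstration prompts against a task input
# and prepend the best match before sending it to a frozen LLM.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_prompt(task_input: str, candidates: list[str]) -> str:
    """Return the candidate prompt most similar to the task input."""
    q = Counter(task_input.lower().split())
    return max(candidates, key=lambda c: cosine(q, Counter(c.lower().split())))

def build_zero_shot_input(task_input: str, prompt_pool: list[str]) -> str:
    # The retrieved prompt is prepended so the model sees a relevant
    # demonstration before the actual query.
    return retrieve_prompt(task_input, prompt_pool) + "\n\n" + task_input
```

In UPRISE the scoring function is a trained retriever rather than lexical overlap, which is what lets a single retriever generalize across task types.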

We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model. We train the Alpaca model on 52K instruction-following demonstrations generated in the style of self-instruct using text-davinci-003. On the self-instruct evaluation set, Alpaca shows many behaviors similar to OpenAI’s text-davinci-003, but is also surprisingly small and easy/cheap to reproduce.
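The fine-tuning step above starts by serializing each instruction-following demonstration into a single training string. A hedged sketch of that formatting is below; the Stanford Alpaca repository uses a similar "### Instruction / ### Response" template, but the exact wording here is illustrative and the function name is hypothetical.

```python
# Illustrative formatting of an instruction-following demonstration
# into supervised fine-tuning text (Alpaca-style prompt template).
def format_example(instruction: str, input_text: str, output: str) -> str:
    """Serialize one (instruction, optional input, output) triple."""
    if input_text:
        prompt = (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n### Response:\n"
        )
    else:
        prompt = (
            "Below is an instruction that describes a task. Write a response "
            "that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n### Response:\n"
        )
    # During training, loss is typically computed only on the output tokens.
    return prompt + output
```

Applying this to all 52K generated demonstrations yields the corpus on which the 7B base model is fine-tuned.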

It’s not often we see technologies gain exponential adoption and attention in a very short time frame the same way OpenAI’s ChatGPT has since late 2022. ChatGPT is estimated to have reached 100 million users in just two months. It took Netflix 10 years to reach 100 million users; six and a half years for Google Translate; roughly two and a half years for Instagram; and about nine months for TikTok.

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers.

