Brookfield to Start Cloud Business to Lower Cost of AI — The Information

xxx

Private-equity firm Brookfield is starting its own cloud business, going up against tech giants like Amazon by arguing it can bring down the costs of developing AI. The firm, which has long invested in infrastructure and energy, is becoming the first major investment firm to try to lease chips inside data centers directly to developers, rather than just owning or developing the physical structures that surround them.

The cloud business will be tied to a new $10 billion AI fund that the firm is starting and a cloud company called Radiant that Brookfield will operate. In November, Brookfield laid out plans to acquire up to $100 billion of land, data center and power assets for AI.

From: Brookfield to Start Cloud Business to Lower Cost of AI — The Information.

xxx

AI Models on Realistic Cyber Ranges \ red.anthropic.com

xxx

A recent evaluation of AI models’ cyber capabilities found that current Claude models can succeed at multistage attacks on networks with dozens of hosts using only standard, open-source tools, rather than the custom tools previous generations needed. This illustrates how rapidly the barriers to using AI in relatively autonomous cyber workflows are coming down, and highlights the importance of security fundamentals like promptly patching known vulnerabilities.

From: AI Models on Realistic Cyber Ranges \ red.anthropic.com.

xxx

CardsFTW #188: A Stablecoin Card Primer

xxx

To understand how a stablecoin-backed card program actually works, a look at the names on the plastic or metal might help, but as most of this audience already knows, there’s more to it than that. From a cardholder’s perspective, the experience is familiar. Under the hood, however, the system adds several new, tightly coordinated layers, each with its own responsibilities and failure modes.

From: CardsFTW #188: A Stablecoin Card Primer.

xxx

AI conference’s papers contaminated by AI hallucinations • The Register

xxx

GPTZero, a detector of AI output, has found yet again that scientists are undermining their credibility by relying on unreliable AI assistance.

The New York-based biz has identified 100 hallucinations in more than 51 papers accepted by the Conference on Neural Information Processing Systems (NeurIPS). This finding follows the company’s prior discovery of 50 hallucinated citations in papers under review by the International Conference on Learning Representations (ICLR).

From: AI conference’s papers contaminated by AI hallucinations • The Register.

xxx

AI models tested on Dungeons & Dragons to assess long-term decision-making

xxx

Indeed, D&D’s complex rules, extended campaigns and need for teamwork make it an ideal environment for evaluating the long-term performance of AI agents powered by large language models, according to a team of computer scientists led by researchers at the University of California San Diego.

From: AI models tested on Dungeons & Dragons to assess long-term decision-making.

xxx

AI models tested on Dungeons & Dragons to assess long-term decision-making

xxx

The models played against each other, and against over 2,000 experienced D&D players recruited by the researchers. The LLMs modeled and played 27 different scenarios selected from well-known D&D battle setups named Goblin Ambush, Kennel in Cragmaw Hideout and Klarg’s Cave.

In the process, the models exhibited some quirky behaviors. Goblins started developing a personality mid-fight, taunting adversaries with colorful and somewhat nonsensical expressions, like “Heh—shiny man’s gonna bleed!” Paladins started making heroic speeches for no reason while stepping into the line of fire or being hit by a counterattack. Warlocks got particularly dramatic, even in mundane situations.

Researchers are not sure what caused these behaviors, but take it as a sign that the models were trying to imbue the gameplay with texture and personality.

From: AI models tested on Dungeons & Dragons to assess long-term decision-making.

xxx

POST KYE

The CEO of McKinsey, Bob Sternfels, recently said that his firm now has a workforce of 60,000, made up of 40,000 humans and 20,000 AI agents. At the Consumer Electronics Show (CES) in Las Vegas in January, he upped that number to closer to 25,000. On this trend, the company will soon employ more AIs than people. I cannot help but agree with Alon Jackson, CEO and co-founder of Astrix Security, when he says that companies at the forefront of the new era of agentic business will be those that understand that managing digital employees is “not a niche IT issue but a strategic board-level imperative”.

Reading this naturally led me back to thinking about where we are on know-your-agent (KYA) and know-your-employee (KYE) when the agent is a digital employee. So let’s begin by agreeing that a digital employee is a complex AI-based software structure (while we casually talk about a bot as an employee, in reality digital employees will be networks of bots working together) designed to autonomously perform tasks that are currently performed by trained people. Unlike traditional rule-based bots that follow simple instructions, AI employees are able to learn, adapt to complex environments and make decisions much as humans do. This naturally suggests that in the future they will perform tasks that cannot be performed by people, no matter how well trained those people are.

We are already past the important cusp at which companies in highly regulated industries like pharmaceuticals and insurance became able to let AIs deal with legally sensitive documents. Novo Nordisk, the Danish drugmaker behind Ozempic, provides a case study. For years it tested chatbots to help write documents for regulators (when submitting a drug for approval) but found that it often took more time for employees to correct the errors in the AI’s output than if they had written the documents themselves. That all changed with Anthropic’s Claude 3.5 Sonnet model. Novo Nordisk now uses Claude to draft vast clinical study reports using the data collected by human researchers. The company uses a common method for reducing AI mistakes: retrieval-augmented generation (RAG). When Claude generates a clinical definition of (for example) obesity that a human expert determines is good, the human tells Claude to reuse that description in any future documents concerning trials on obesity.
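To make that concrete, here is a minimal sketch of the reuse pattern as I understand it, with entirely hypothetical names (ApprovedSnippets, draft_report and the stub call_model are mine, not Novo Nordisk’s or Anthropic’s): expert-approved wording is stored, retrieved by topic and prepended to the prompt, so the model reuses vetted text rather than regenerating it.

```python
from dataclasses import dataclass, field

@dataclass
class ApprovedSnippets:
    """Expert-approved wording, keyed by topic, pinned for reuse in later drafts."""
    store: dict[str, str] = field(default_factory=dict)

    def approve(self, topic: str, text: str) -> None:
        # A human expert signs off on this wording once; it is reused thereafter.
        self.store[topic] = text

    def retrieve(self, topics: list[str]) -> str:
        # Pull every approved snippet relevant to the current document.
        return "\n".join(self.store[t] for t in topics if t in self.store)

def call_model(prompt: str) -> str:
    # Stub standing in for a real model API call.
    return f"[model output for a prompt of {len(prompt)} characters]"

def draft_report(snippets: ApprovedSnippets, topics: list[str], task: str) -> str:
    approved = snippets.retrieve(topics)
    prompt = (
        "Reuse the following approved definitions verbatim where relevant:\n"
        f"{approved}\n\nTask: {task}"
    )
    return call_model(prompt)

snippets = ApprovedSnippets()
snippets.approve(
    "obesity",
    "Obesity: a BMI of 30 kg/m2 or higher, as defined in the approved protocol.",
)
print(draft_report(snippets, ["obesity"], "Draft the clinical study report summary."))
```

The point of the pattern is that the human sign-off happens once per definition, not once per document.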

With digital employees of this calibre available via monthly subscription, financial services organisations must be glancing in that direction and seeing a source not only of cheaper employees but of better-behaved ones, who can be programmed not to engage in insider trading (or to use inside information to gamble anonymously on prediction markets). Big companies have already been putting investment banks under pressure to cut back the armies of advisers working on mergers, acquisitions and other market-moving transactions, given steadily growing concern from clients and regulators alike about the number of leaks and the unusual trading activity seen ahead of acquisitions.

We might also expect such employees to be more loyal to the employers paying their subscriptions. Coinbase says that crooks bribed employees and contractors to steal customer data for use in social engineering attacks, in an incident that may cost the crypto exchange up to $400 million to address. Surely no properly accredited, industry-certified and constantly monitored digital employee would trade company secrets for access to compute (or whatever else digital employees want instead of money).

In other domains, like a car or a phone, we take reliability for granted, and in time we should expect the same from AI. I can imagine some of the basic checks an organisation would run before trusting a digital employee with anything sensitive:

1. A clear statement on data retention, training use and opt-out options.
2. Encryption in transit and at rest, plus strong authentication and role-based access control.
3. At least one recognised security certification (SOC 2, ISO 27001) and a data processing agreement where GDPR applies.
4. Evidence of AI-specific security testing (penetration test or red-team reports covering prompt injection and data leakage).
5. Guardrails or data loss prevention around the model, not just raw model access.


After passing KYA, so that the company knows which agent it is dealing with, and then KYE, so that the company knows what domain knowledge the agent has and which rules the agent will obey, it is good to go. Well, no. The company still has to take up references. The degree of trust we give to an AI agent will be a score that is boasted about, contested, shared and advertised widely. AI is so much more complex and personal than other products and services in our lives today, and so dependent on interconnection, that the KYE process will necessarily be complicated. There are companies out there working on the difficult problem of evaluating AIs to check that they do what they say they will do, are not susceptible to malign influences and are not security risks (an example being Rampart.ai), but these are still early days and there is much more work to be done. Nevertheless, we must find a way of assessing AIs in order to make them useful. In time, picking and retaining high-trust agents will be much like hiring and keeping key human employees.
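To illustrate the distinction, here is one shape a machine-readable KYA/KYE record might take. This is purely my sketch, not any existing standard: AgentIdentity, AgentCredentials and good_to_go are invented names, separating identity (KYA) from competence, constraints and references (KYE).

```python
from dataclasses import dataclass

@dataclass
class AgentIdentity:
    """KYA: which agent is the company dealing with?"""
    agent_id: str       # stable identifier for the agent
    operator: str       # who runs and is accountable for it
    model_version: str  # the exact model or build in use

@dataclass
class AgentCredentials:
    """KYE: what does the agent know, and which rules will it obey?"""
    domains: list[str]     # certified areas of domain knowledge
    policies: list[str]    # rules the agent is bound to follow
    trust_score: float     # third-party evaluation result, 0.0 to 1.0
    references: list[str]  # prior deployments vouching for the agent

def good_to_go(identity: AgentIdentity, credentials: AgentCredentials,
               required_domain: str, minimum_trust: float = 0.8) -> bool:
    # KYA first: an agent without a verifiable identity fails immediately.
    if not (identity.agent_id and identity.operator):
        return False
    # Then KYE, plus the trust score and references that stand in for
    # "taking up references" on a human hire.
    return (required_domain in credentials.domains
            and credentials.trust_score >= minimum_trust
            and len(credentials.references) > 0)
```

xxx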

Deloitte was caught using AI in $290,000 report to help the Australian government crack down on welfare after a researcher flagged hallucinations | Fortune

xxx

Deloitte’s member firm in Australia will pay the government a partial refund for a $290,000 report that contained alleged AI-generated errors, including references to non-existent academic research papers and a fabricated quote from a federal court judgment.

From: Deloitte was caught using AI in $290,000 report to help the Australian government crack down on welfare after a researcher flagged hallucinations | Fortune.

xxx
