Discussion of the impact of Artificial Intelligence on society has been dominated by the emergence of Large Language Models into public awareness over the last 18 months. While there have been many advances in adjacent research areas, none have captivated the public’s imagination, or anxieties, to the same degree as generative models in textual and graphical domains. Lloyd’s report on the transformation of the cyber landscape explores how GenAI could be used by threat actors and cyber security professionals and highlights its potential impacts on cyber risk.

Major leaps in the effectiveness of Generative AI and Large Language Models have dominated the discussion around artificial intelligence. Given its growing availability and sophistication, the technology will inevitably reshape the cyber risk landscape.

Due to the rapidity of advances in AI research and the highly dynamic nature of the cyber environment, analysis of the consequences these tools may have for cyber perils has been limited.

Beinsure Media summarises the key highlights of the report: the Large Language Model landscape, the transformation of cyber risk, considerations for business and insurance, and the ways in which Lloyd’s will take action to develop solutions that build greater cyber resilience.

Lloyd’s has been exploring the complex and varied risks associated with AI since developing the world’s first autonomous vehicle insurance in 2016.

We’re all early on in exploring the world of AI, its risks and opportunities – but our market is designed to bring expertise together to underwrite the unknown.

Dr Kirsten Mitchell-Wallace – Lloyd’s Director of Portfolio Risk Management

Lloyd’s is committed to working with insurers, startups, governments and others to develop innovative products and intelligent policy guiderails that can support the development of this important technology and create a more resilient society (see Artificial Intelligence Becomes an Unexpected Risk for Insurance).

Artificial Intelligence and Large Language Models


Approximately six years ago, researchers at Google published a seminal paper introducing a novel architecture for encoding, representing, and accessing sequential data with complex structure. This architecture, dubbed the ‘Transformer’, would underpin almost all language-, vision-, and audio-based generative machine learning approaches by 2023.
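At the heart of the Transformer is the attention mechanism, in which each element of a sequence is recomputed as a weighted average of all the others. The following is a deliberately minimal, pure-Python sketch of scaled dot-product attention, for illustration only; production implementations are batched, multi-headed, and run on accelerators:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys,
    producing a weighted average of the value vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

With one-hot values, the output exposes the attention weights directly: a query aligned with the first key puts more weight on the first value than the second, and the weights always sum to one.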

Early models (circa 2018) were limited in their capability, but progressively scaling up the computing power and dataset size resulted in rapid advances, culminating in the release of ChatGPT, a web portal interface to GPT-3.5, less than 18 months ago (Nov ‘22), to considerable public interest and concern (see How Can AI Technology Change Insurance Claims Management?).

Since November 2022, notable events include the release of GPT-4 (March ‘23), which exhibited capability similar to humans across a battery of benchmarked tasks, Google’s Bard (March ’23), and openly released equivalents from Meta (March and July ‘23).

AI model governance, financial barriers, guard-rails

Source: Lloyd’s

The rise of powerful generative models brings with it tremendous opportunity for innovation, but also introduces significant risks of harm and misuse.

It is fair to ask why, despite the claims of advanced capabilities of these tools, few material impacts on the cyber threat landscape seem to have occurred. The answer, so far, is that the industry’s focus on safety, together with economic barriers, has prevented widespread misuse (see How Does AI Technology Impact on Insurance Industry?).

“AI Safety” is a term without a consensus definition, referring to several related and interlinked areas, which can be classified in three broad categories:

  • (A) Autonomy and advanced capabilities – calls for oversight and control of “systems which could pose significant risks to public safety and global security”
  • (B) Content generated by the models – potentially leading to issues with privacy, copyright, bias, disinformation, public over-reliance, and more
  • (C) Malicious use of the models – leading to harm or damage for people, property, tangible and intangible assets

The first sense of AI Safety (A) has received increasing attention in 2023, with governments allocating resources to understanding the risks involved. The UK government has created a ‘Frontier AI Taskforce’ consisting of globally recognised industry experts, tasked with advising and shaping policy pertaining to the creation of powerful models.


However, despite the growth in interest and investment, the nature and extent of the risk posed by these systems is still unclear.

Due to the lack of quantitative, or even qualitative, information on this topic, it is not considered further in this report, but it will be important to monitor as the situation develops.

Research labs, commercial enterprises, and policymakers have focused on understanding safety in the sense of (B), which relates mainly to issues around bias, privacy, fairness, and transparency. These are all serious issues that are rightly the subject of active research, and that have potential consequences for policies beyond Cyber or Tech E&O; they are, however, beyond the scope of this report.

The remaining pressing concern is the question linked to (C): “How can the risk of harm or damage arising from human actors intentionally using these models maliciously be mitigated?” Broadly, there are three mechanisms which have underpinned the safety apparatus curtailing malicious use of Generative Artificial Intelligence technology.

Key elements of model governance for enterprises and research groups producing LLMs include:

  • Output artifacts of the model training processes (known as model ‘weights’) are kept secret and not released to the public; possessing the model code and access to computing hardware alone is insufficient to run the models. The weights are kept as closed as possible to create a commercial and regulatory moat, and to prevent misuse
  • Model training and inference (serving requests) takes place on private computing infrastructure, with internal details opaque to end users
  • Setting and following rules for monitoring AI models in areas like quality, performance, reliability, security, privacy, ethics, and accountability
  • Application of governance principles throughout the entire lifecycle of AI models: training, analysis, release (if applicable), deprecation

The key outcome of model governance is preventing the public from having oversight-free access to emerging disruptive technologies until adequate safety controls can be enacted: technological, regulatory, legal, or otherwise (see Key Benefits of Innovative AI & Machine Learning Technologies).

Costs or hardware requirements for training and running large models

Source: Lloyd’s

The process of training, fine-tuning, or performing inference with large generative models is computationally intensive, requiring specialised computing hardware components.

Inference tasks on these models have less exorbitant requirements but, until recently, still required access to a datacentre, with prohibitive costs for most threat actors.
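A rough sense of why these requirements have been prohibitive can be had from weight-memory arithmetic alone. The figures below are illustrative back-of-envelope estimates, ignoring activations, optimiser state, and serving overhead:

```python
def weights_memory_gb(n_params_billions: float, bytes_per_param: float) -> float:
    """Rough memory (GB) needed just to hold a model's weights.

    Ignores activations, optimiser state, and KV-cache overhead,
    so real requirements are higher.
    """
    return n_params_billions * 1e9 * bytes_per_param / 1e9

# Illustrative: a 70B-parameter model at 16-bit precision (2 bytes/param)
# needs roughly 140 GB for weights alone -- datacentre-class hardware --
# while the same model quantised to 4 bits (0.5 bytes/param) needs ~35 GB.
```

For example, `weights_memory_gb(70, 2)` gives 140 GB, versus 35 GB for the same parameter count at 4-bit precision, which is why precision reduction features so heavily in the cost collapse discussed below.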

However, recent developments have driven these costs down, as will be discussed in the next section.

One consequence of this has been that the public, including small research labs and universities, have been unable to train or run their own large models, restricting them to much less capable versions. All access to ‘frontier-grade’ generative models has been through the large labs (OpenAI, Meta, Anthropic, Google), and is subject to their strict governance, oversight, and safeguards.

Generative AI models and tooling

Source: Lloyd’s

An important consequence of controlling the training process of LLMs and restricting public access to the models through custom interfaces is the ability to apply strict controls to their usage in accordance with the governance safety principles of the hosting organisation.

For all large commercial models to date, except those released by Meta, access has been closely safeguarded and monitored through specialised interfaces such as ChatGPT or Bing Chat.

Commercial state-of-the-art LLM training involves a safety pipeline with several key elements:

  • Strict curation of the input training data to ensure minimal toxic, illegal, or harmful content can be seen by the model
  • Specialised ‘fine-tuning’ techniques involving human curators who provide feedback on potential model responses
  • Adversarial testing with domain experts assessing the capability of the model, especially with respect to emergent behaviours
  • Layers of evaluations and mitigations ensuring the models adhere to safety principles
  • Strict curation of user-interface access: requests and responses are screened and logged via web interfaces; dangerous requests result in revocation of access and, potentially, referral to local authorities
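The interface-level screening in the pipeline above can be caricatured in a few lines. This is a deliberately minimal sketch assuming a hypothetical keyword blocklist; production systems rely on trained safety classifiers and human review rather than string matching:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("gateway")

# Hypothetical blocklist for illustration only; real deployments use
# trained safety classifiers, not keyword matching.
DISALLOWED_PHRASES = ("ransomware payload", "credential stealer")

def screen_request(user_id: str, prompt: str) -> bool:
    """Screen and log a request before it ever reaches the model."""
    lowered = prompt.lower()
    for phrase in DISALLOWED_PHRASES:
        if phrase in lowered:
            # In production: flag the account, potentially revoke access
            log.warning("user %s blocked: matched %r", user_id, phrase)
            return False
    log.info("user %s request forwarded to model", user_id)
    return True
```

The key property is architectural, not algorithmic: because every request passes through the provider’s gateway, screening and logging cannot be bypassed by the end user.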

If users do not have full access to the models and all internal components, it is impossible to circumvent these restrictions in any meaningful way; while some ‘jailbreak prompts’ may allow a soft bypass, they are ineffective for very harmful requests. Likewise, users cannot bypass screening mechanisms if forced to interact with the models through online portals.

Several advanced techniques have been developed which have dramatically driven down the computational requirements for training, fine-tuning, and running inference for these models. Hundreds of LLMs now exist in the wild for a variety of tasks, many of which can be run locally on commodity hardware.

As of September 2023, it is possible to run an LLM with capability equivalent to GPT-3.5 on consumer-grade hardware such as a MacBook M2, completely locally (without an internet connection).
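One of the techniques behind this collapse in requirements is weight quantisation: storing each parameter in 4 bits rather than 16 or 32. The toy per-tensor version below is illustrative only; production schemes quantise per-block and handle outliers specially:

```python
def quantize_4bit(weights):
    """Map float weights onto 16 integer levels (0..15) with a shared
    scale and offset -- a toy per-tensor quantisation scheme."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 15 or 1.0  # guard against a constant tensor
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_4bit(codes, scale, lo):
    """Recover approximate float weights from the 4-bit codes."""
    return [lo + c * scale for c in codes]
```

Each weight now occupies 4 bits instead of 16, a 4x memory reduction, at the cost of a rounding error bounded by half the quantisation step.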

This means that all safeguards detailed above can be completely circumvented: models can be adjusted to answer all requests regardless of harm, and this can be done in completely sealed, cheap local computing environments, without any oversight.

Transformation of cyber risk

The following illustrative framework explores how Gen AI tools could be used by threat actors or cyber security professionals and highlights some potential impacts on cyber risk.

While there is no definitive list of components which drive or enable the formation of cyber crime campaigns, it is possible to consider the following factors, which influence the frequency and severity of cyber threats in predictable ways, and to assess the potential impact of emerging LLM technology on each of them.

Drivers of cyber threat

Source: Lloyd’s

The existence of software or hardware vulnerabilities to exploit is a requirement for almost all forms of threat actor action, except for purely social engineering-based approaches. Analysing code bases for significant vulnerabilities which can lead to exploits is time consuming and requires significant expertise.

Vulnerability discovery

Source: Lloyd’s

Usage of LLMs fine-tuned for code analysis to identify exploitable programming errors has the potential to drive the ‘cost-per-vulnerability’ down by orders of magnitude relative to human investigation by performing at-scale scans of open-source repositories.

LLMs enable automated vulnerability discovery, even in domains which are very challenging for humans, such as:

  • Embedded micro-code and firmware
  • Decompiled proprietary binaries in closed source enterprise software
  • Hardware device drivers
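To illustrate the kind of at-scale scanning such tooling automates, here is a deliberately naive pattern-level sketch; the flagged function names and CWE labels are standard, but the approach is a caricature: LLM-assisted analysis reasons about data flow and semantics rather than matching signatures:

```python
import re

# A few classically dangerous C library calls; labels are illustrative.
RISKY_CALLS = {
    r"\bgets\s*\(": "unbounded read into buffer (CWE-242)",
    r"\bstrcpy\s*\(": "unbounded string copy (CWE-120)",
    r"\bsprintf\s*\(": "unbounded formatted write (CWE-120)",
}

def scan_source(code: str):
    """Return (line_number, issue) pairs for each risky call found."""
    findings = []
    for lineno, line in enumerate(code.splitlines(), start=1):
        for pattern, issue in RISKY_CALLS.items():
            if re.search(pattern, line):
                findings.append((lineno, issue))
    return findings
```

The economic point stands regardless of sophistication: once vulnerability discovery is a batch job over repositories rather than an expert reading code, the cost-per-finding falls dramatically.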

A larger pool of vulnerabilities to choose from grants a greater flexibility in design of exploits, choice in targets, and campaign methodologies.

Successful attacks require a series of vulnerabilities to be exploited, allowing attackers to progressively gain initial access to their target, create a foothold and traverse the systems, and finally create an impact.

Regardless of the specifics of these factors, a larger attack surface in a target system makes every step easier and cheaper.

  • AI-enhanced dynamic analysis tools for discovering vulnerabilities in ‘live’ software environments or networks could be a force multiplier for very skilled actors
  • LLM-powered malware can run on physically unobtrusive, portable hardware like a Raspberry Pi device.

While the capabilities of malware on low-power devices will be more limited, these tools could significantly increase the risk associated with physical access vectors

While security professionals and vendors will likely look to utilise similar LLM-enhanced tooling defensively in areas like threat intelligence, incident response, and monitoring and detection, there remain fundamental asymmetries that may provide threat actors an advantage.

Threat actors and their organisations will likely have greater incentives and flexibility to construct highly customised tools for narrowly focused augmentation tasks.

The potential financial rewards for novel vulnerability discovery or exploitation provide strong motivation to explore even obscure and highly specialised targets, and the risks involved with their activities result in an extremely high cost of failure.

Campaign planning and execution

Source: Lloyd’s

Risk-reward analysis

Source: Lloyd’s

Single points of failure

Source: Lloyd’s

As general-purpose tools which have strong ‘comprehension’ of human generated content across modalities, it is reasonable to expect LLM integration and impact to occur across many distinct levels of society, with the size of units affected (political, organisational, economic, and cultural) growing proportionately with the capabilities of the tools.

Threat actors will potentially be able to attack this new layer of scaffolding in an already densely connected world, creating new opportunities for large accumulations or catastrophes.

  • Widespread integration brings with it a multiplicity of systemic coupling risks, both direct and indirect: provider concentration risk, with the emergence of several monopolistic providers of (legal) LLM tooling for individuals and enterprises (OpenAI, Google, Microsoft) acting as additional single points of failure.
  • Common source datasets for LLM training create risk of vulnerabilities embedded in models accidentally or intentionally via dataset poisoning.
    • Potential for AI generated code with common vulnerabilities mass generated by coding-assistance tools
    • Potential for wide-spread aberrant behaviour of LLMs triggered by innocuous inputs and services utilising them
  • Use of LLM-derived algorithms to control large centralised systems, such as industrial or financial systems, could result in unpredictable failure modes with large footprints of exposure
  • Model bias stemming from fundamental inductive biases in algorithm architecture or data may result in systemically correlated decisions, processes, or output across industries where discovery of these correlations is difficult or impossible
  • Potential geopolitical tensions arising from control of semiconductor manufacturing capacity, advanced research, or skilled workers amplify uncertainty

Considerations for business and insurance


The available evidence, as discussed in the previous sections, allows us to describe the likely impact of Artificial Intelligence and Large Language Models on the frequency and severity of cyber-related losses, providing a strong basis for businesses and the insurance industry to carefully assess the potential impacts.

Responding to the new risk landscape

  • Insurance: The potential for a broader set of businesses to be subjected to attacks places a greater emphasis on closing protection gaps for currently underserved audiences, such as SMEs
  • Business: It will be increasingly important for businesses to invest in the mapping of their critical functions and open-up conversations around cyber defence and restoration capabilities and business continuity planning beyond the risk and information security functions. Organisations using Gen AI should also be able to detail the procedures around how they use it and rely upon it, to evidence with transparency any potential operational risks
  • Government: Cross-industry working groups around cyber security could offer a platform to share information and learnings in a trusted way. Likewise, collaborating on supply chain failovers could be helpful to reduce the disruption to the economy if a business is impacted
  • Society: Educating society on cyber hygiene practices, such as zero trust or multi-factor authentication, can reduce susceptibility to social engineering. In addition, education programmes for young people could help to instil good practice and foster the right mindset in the next generation
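The multi-factor authentication mentioned above typically rests on time-based one-time passwords (TOTP, standardised in RFC 6238), which can be sketched with nothing but the Python standard library; this is a bare-bones illustration, not a production authenticator:

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Time-based one-time password (RFC 6238) using HMAC-SHA1."""
    # The moving factor is the number of 30-second steps since the epoch
    counter = struct.pack(">Q", unix_time // step)
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): pick 4 bytes at an offset given
    # by the low nibble of the last digest byte
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

With the RFC 6238 test secret `b"12345678901234567890"` at time 59, the 8-digit code is the specification’s published vector “94287082”; the code changes every 30 seconds, which is what makes intercepted credentials short-lived.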

A new cyber threat landscape


Overall, AI has the potential to act as an augmentation of threat actor capability, enhancing the effectiveness of skilled actors, improving the attractiveness of the unit cost economics, and lowering the barrier to entry.

It is likely to mean that there will be more vulnerabilities available for threat actors to exploit, and that it will be easier for them to scout targets, construct campaigns, finetune elements of the attacks, obscure their methods and fingerprint, exfiltrate funds or data, and avoid attributability.

All these factors point to an increase in lower-level cyber losses, mitigated only by the degree to which the security industry can act as a counterbalance.

  • Initial access vectors which rely on human targets making errors of judgement (spear phishing, executive impersonation, poisoned watering holes, etc.) are likely to become significantly more effective as attacks become more targeted and fine-tuned for recipients
  • Attacks are likely to reach broader audiences due to lower cost of target selection and campaign design, meaning the absolute number of losses, and the potential severity of each loss could grow
  • Industrial or operational technology attacks are likely to become more common as automation uncovers vulnerabilities
  • Embedding AI into software could create entirely new initial access vectors for threat actors to exploit, resulting in larger surface area of attack, and consequently more claims
  • The industrialised production of synthetic media content (deepfakes) poses significant challenges for executive impersonation, extortion, and liability risks

Though more companies will be vulnerable to cyber attacks and there will be more security flaws that threat actors can exploit, it is uncertain if this will lead to an increase in highly targeted attacks on specific companies, an increase in broad attacks aimed at many companies, or some other mixed outcome.

The increased number of potential targets and vulnerabilities creates the potential for growth in both focused and widespread cyber campaigns.

Overall, it is likely that the frequency, severity, and diversity of smaller scale cyber losses will grow over the next 12-24 months, followed by a plateauing as security and defensive technologies catch up to counterbalance.

Cyber catastrophes


Cyber campaigns tend to be designed with specific objectives and aim to maximise returns for the perpetrators, so most threat actors have a strong incentive to keep their actions concealed and their attacks contained.

Catastrophes in cyber occur, for the most part, because the mechanisms put in place by the perpetrators to keep the campaign under control have failed.

The exception to this is state-backed, hostile cyber activity, which includes campaigns designed to cause indiscriminate harm and destruction. It is important to look at the two types of events separately, and distinguish between manageable cyber catastrophes and state-backed, hostile cyber activity.

There is evidence to suggest that the AI-enhancement of threat actor capabilities detailed in the previous section could increase the frequency of manageable cyber catastrophes. However, as the mechanism of action is indirect, the magnitude of any increase is likely to be small.

There are several factors driving the occurrence of manageable cyber catastrophes considering AI augmentation of threat actor capabilities:

  • The frequency of manageable cyber catastrophes may increase as campaigns are designed to target a broader set of businesses, coupled with some automation of attacks
  • AI-enhancements are also likely to result in better and more effective designs of controls for cyber campaigns. This would allow threat actors to develop more targeted campaigns, meaning that the overall increase in frequency for catastrophes is likely to be lower than the increase for smaller scale losses
  • There is evidence of concentration of LLM services, creating a new tier of cloud provider. This new breed of cloud providers would itself be vulnerable to failures, therefore increasing the frequency of catastrophes associated with single points of failure

The last point deserves some more context. The emergence of LLM services creates an opportunity for threat actors to monetise their attacks in novel ways.

A concentration of LLM services in turn creates fertile ground for large accumulations, or in other words catastrophes, akin to existing service provider failure scenarios, but with potentially slightly different, and more severe, effects than is possible today.

In conclusion, it is highly probable that the frequency of manageable cyber catastrophes will moderately increase. The risk is very unlikely to sharply escalate without massive improvements in AI effectiveness, which current industry oversight and governance make improbable; this is an area where an increased focus from regulators may be helpful.

The increases in catastrophe risk will more likely be gradual based on the steady but incremental progress in AI capabilities that can reasonably be anticipated.

State-backed, hostile cyber activity

State-backed, hostile cyber activity, which includes campaigns designed to cause indiscriminate harm and destruction, gives rise to systemic risks which require a different pricing and aggregation approach.

The effects of Gen AI on this type of systemic risk will surface in the augmentation of tooling and the automation of vulnerability discovery, both of which could enhance existing means to intentionally cause harm and destruction.

It is conceivable that the efforts to discover new exploits could concentrate on high impact targets, particularly industrial technology.

The conclusion is that cyber weapons are likely to become more effective, in both destructive power and espionage capabilities.

However, it is unclear to what extent the proliferation of advanced capabilities will increase the risk of a major catastrophe happening.

The trend is clearly upwards, but once again the human factor will come into play, and the mere existence of these capabilities might not directly translate into deployment, let alone indiscriminate deployment.


AUTHOR: Dr Kirsten Mitchell-Wallace – Lloyd’s Director of Portfolio Risk Management
