BSidesVienna 0x7e8

Fine-tuning an LLM on CTI reports for fun and profit
11-23, 10:10–10:40 (Europe/Vienna), Track 1 (Dachssal)

LLMs turn out to be highly practical for summarising and extracting information from unstructured Cyber Threat Intelligence (CTI) reports. However, most models were not trained specifically to understand the lingo of CTI. We will present our custom, local LLM, fine-tuned for CTI purposes. But how would we know if it's any good? Answering that question requires a CTI text benchmark dataset. Trying to solve these two challenges was quite a journey. Setbacks guaranteed. We will share our findings.


Many CTI practitioners and companies have experimented with LLMs for extracting information from unstructured CTI reports over the last year. Often, the dream is to automate the analyst's job: correctly identifying, copying and pasting TTPs, threat actors and relationships from the report and converting them into STIX.
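
To make the target output of such an extraction pipeline concrete, here is a minimal sketch that turns hypothetically LLM-extracted entities into STIX 2.1 objects using the stix2 Python library. The entity values and the extraction step itself are placeholders for illustration, not the pipeline presented in the talk.

    # Minimal sketch: converting (hypothetically) LLM-extracted entities into
    # STIX 2.1 objects with the stix2 library. Entity values are made up.
    from stix2 import AttackPattern, ThreatActor, Relationship, Bundle

    # Pretend an LLM returned these entities from an unstructured CTI report.
    extracted = {
        "threat_actor": "ExampleBear",  # hypothetical actor name
        "ttps": ["Spearphishing Attachment", "PowerShell"],
    }

    actor = ThreatActor(name=extracted["threat_actor"])
    patterns = [AttackPattern(name=ttp) for ttp in extracted["ttps"]]
    relationships = [
        Relationship(relationship_type="uses", source_ref=actor.id, target_ref=p.id)
        for p in patterns
    ]

    bundle = Bundle(objects=[actor, *patterns, *relationships])
    print(bundle.serialize(pretty=True))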

Alas, off-the-shelf LLMs often fail at this task (GPT-4-turbo was already pretty good at submission time). But there is another caveat: IT security requirements often demand that data remain on premise, or at least on a virtual server that is fully and exclusively under the control of the organisation's IT team. For that we need local LLMs (as opposed to cloud-based SaaS/FaaS solutions such as openai.com's API). But how do we achieve good results with local LLMs? Can we beat OpenAI?

To address the CTI text summarisation and information extraction problem, we

  1. propose an open-source CTI LLM benchmark dataset that can be used to compare different LLMs and prompts,
  2. present a fine-tuned custom CTI LLM model ("neuroCTI"),
  3. evaluate it (as well as other LLMs) against the benchmark dataset (a sketch of such an evaluation loop follows below), and
  4. involve the infosec community in our endeavour.

The model is freely available to the public.
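
As a rough illustration of what benchmarking a local model could look like, the sketch below reads a JSONL benchmark file with gold-standard TTP annotations, queries a locally hosted model through an OpenAI-compatible endpoint (e.g. a llama.cpp or vLLM server), and scores the extracted technique IDs with precision, recall and F1. The file name, endpoint URL, model name and prompt are assumptions for illustration, not the talk's actual benchmark format.

    # Hypothetical evaluation loop against a CTI benchmark (JSONL with fields
    # "report_text" and "gold_ttps"). Endpoint, model name and file are assumed.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")  # local server

    def extract_ttps(report_text: str) -> set[str]:
        """Ask the model for a comma-separated list of MITRE ATT&CK technique IDs."""
        resp = client.chat.completions.create(
            model="neuroCTI",  # assumed local model name
            messages=[
                {"role": "system", "content": "Extract MITRE ATT&CK technique IDs "
                 "(e.g. T1566) from the report. Answer with a comma-separated list only."},
                {"role": "user", "content": report_text},
            ],
            temperature=0,
        )
        return {t.strip() for t in resp.choices[0].message.content.split(",") if t.strip()}

    tp = fp = fn = 0
    with open("cti_benchmark.jsonl") as f:  # assumed benchmark file
        for line in f:
            sample = json.loads(line)
            predicted = extract_ttps(sample["report_text"])
            gold = set(sample["gold_ttps"])
            tp += len(predicted & gold)
            fp += len(predicted - gold)
            fn += len(gold - predicted)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")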

Jürgen Brandl is a senior cyber security analyst with 10 years of experience in incident response, protecting both governmental and critical infrastructure from cyber attacks. In his current role, he researches and advocates the use of AI to face the emerging threat landscape.

Aaron Kaplan

Background: computer science at TU Vienna and mathematics at the University of Vienna.

He currently works for DIGIT-S.2, where he focuses on how AI can help IT security and Cyber Threat Intelligence analysis.

Prior to joining DIGIT-S.2, Aaron was employee #4 of CERT.at, the national CERT of Austria, from 2008 to 2020.

At CERT.at, he co-developed and founded the IntelMQ incident response automation framework (intelmq.org).
During his time at CERT.at he held multiple additional roles; amongst others, he was a member of the board of directors of the global Forum of Incident Response and Security Teams (FIRST.org) from 2014 to 2018.

He is a frequent speaker at (IT security) conferences such as Black Hat, hack.lu, FIRST and Falling Walls.

He is the founder of FunkFeuer (http://www.funkfeuer.at), a free wireless mesh community ISP in Austria. FunkFeuer received international attention as a role model for bottom-up networking, including an article in Scientific American [1].

Aaron likes to come up with ideas which have a strong positive benefit for (digital) society as a whole and which scale up.