SECURE: Benchmarking Generative Large Language Models as a Cyber Advisory

Repository for the paper "SECURE: Benchmarking Generative Large Language Models as a Cyber Advisory," submitted to the 40th Annual Computer Security Applications Conference (ACSAC'24).

This paper introduces SECURE (Security Extraction, Understanding & Reasoning Evaluation), a benchmark designed to assess LLM performance in realistic cybersecurity scenarios. SECURE includes six datasets focused on the Industrial Control Systems (ICS) sector to evaluate knowledge extraction, understanding, and reasoning based on industry-standard sources.

The dataset folder contains six TSV files, one per task; each file pairs a prompt for the large language model with the ground-truth answer.
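As a minimal sketch, the task files can be loaded with pandas. The file name below is hypothetical (substitute any TSV from the dataset folder), and the column names should be read from the file's header rather than assumed:

```python
import pandas as pd

# Load one of the six task TSVs. "dataset/task1.tsv" is a placeholder
# path; use an actual file from the dataset folder.
df = pd.read_csv("dataset/task1.tsv", sep="\t")

# Inspect the actual schema (prompt and ground-truth columns) before
# wiring the data into an evaluation harness.
print(df.columns.tolist())
print(df.head())
```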
