Stronger, Faster, and Cheaper Log Parsing with LLMs

June 10, 2024
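  • Abstract
    Log parsing, the process of converting raw log messages into a structured format, is an important first step in the automated analysis of logs from large-scale software systems. Traditional log parsers typically rely on heuristics or handcrafted features, which may not generalize well to diverse log sources or may require extensive model tuning. Recently, some log parsers have leveraged the powerful generative capabilities of large language models (LLMs). However, they rely heavily on demonstration examples, which results in substantial overhead in LLM invocations. To address these issues, we propose LogBatcher, a cost-effective LLM-based log parser that requires no training process or labeled data. To leverage the latent characteristics of log data and reduce overhead, we divide logs into several partitions through clustering. We then perform a cache matching process to match logs with previously parsed log templates. Finally, we provide the LLM with better prompt context specialized for log parsing by batching a group of logs from each partition. We conducted experiments on 16 public log datasets, and the results show that LogBatcher is effective and efficient for log parsing.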
  • Problem addressed
    Log parsing is an important first step in the automated analysis of logs, but traditional parsers rely on heuristics or handcrafted features that may not generalize across log sources or may need extensive tuning. Recent LLM-based parsers depend heavily on demonstration examples, which incurs substantial overhead in LLM invocations. The paper proposes LogBatcher, a cost-effective LLM-based log parser that requires no training process or labeled data.
  • Key idea
    LogBatcher divides logs into partitions through clustering, performs cache matching to match logs with previously parsed templates, and provides LLMs with better prompt context by batching a group of logs from each partition.
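
    The three steps above (partitioning, cache matching, batched prompting) can be illustrated with a minimal Python sketch. It is not the paper's implementation: this summary does not specify the clustering algorithm, the cache structure, or the prompt wording, so the digit-masking `signature` key, the regex-based cache lookup, the prompt text, and every function name here are assumptions made for illustration. `toy_llm` is a stand-in for a real model call.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List

WILDCARD = "<*>"


def signature(log: str) -> tuple:
    """Assumed clustering key: tokens, with any digit-bearing token masked."""
    return tuple(WILDCARD if any(c.isdigit() for c in tok) else tok
                 for tok in log.split())


def partition(logs: List[str]) -> Dict[tuple, List[str]]:
    """Group logs that share a signature (a crude stand-in for clustering)."""
    groups: Dict[tuple, List[str]] = defaultdict(list)
    for log in logs:
        groups[signature(log)].append(log)
    return groups


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Turn a template into a regex; each wildcard matches one token."""
    parts = [re.escape(p) for p in template.split(WILDCARD)]
    return re.compile("^" + r"\S+".join(parts) + "$")


def build_prompt(batch: List[str]) -> str:
    """One instruction line followed by the batched logs, one per line."""
    return ("Extract one shared log template from the logs below; "
            f"replace variable parts with {WILDCARD}.\n" + "\n".join(batch))


def parse_logs(logs: List[str], llm: Callable[[str], str],
               batch_size: int = 5) -> Dict[str, str]:
    cache: List[str] = []          # previously parsed templates
    result: Dict[str, str] = {}    # log message -> template
    for group in partition(logs).values():
        # Cache matching: reuse an earlier template when one already fits.
        pending = []
        for log in group:
            hit = next((t for t in cache
                        if template_to_regex(t).match(log)), None)
            if hit is None:
                pending.append(log)
            else:
                result[log] = hit
        # Batching: send a few distinct logs from the partition in one query.
        while pending:
            batch = list(dict.fromkeys(pending))[:batch_size]
            template = llm(build_prompt(batch)).strip()
            pattern = template_to_regex(template)
            matched = [log for log in pending if pattern.match(log)]
            if not matched:
                # Guard: if the answer fits nothing, keep one log verbatim.
                template, matched = pending[0], [pending[0]]
            cache.append(template)
            for log in matched:
                result[log] = template
            pending = [log for log in pending if log not in matched]
    return result


def toy_llm(prompt: str) -> str:
    """Stand-in for an LLM: wildcard every token position that varies."""
    rows = [line.split() for line in prompt.split("\n")[1:]]
    return " ".join(col[0] if len(set(col)) == 1 else WILDCARD
                    for col in zip(*rows))
```

    Calling `parse_logs(logs, toy_llm)` returns a dictionary from each log message to its template. In this sketch, a log whose variable field contains no digits lands in a different partition, but it can still be resolved from the cache without another model call, which is the cost saving the cache-matching step is meant to provide.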
  • Other highlights
    Experiments on 16 public log datasets show that LogBatcher is effective and efficient for log parsing. The paper also discusses the limitations of LogBatcher and suggests future work to improve its performance.
  • Related work
    Related work includes traditional log parsers, recent LLM-based log parsers, and other clustering and batching techniques for log analysis. Some relevant papers include 'LogMine: Fast Pattern Recognition for Log Analytics' and 'DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning'.