Hello everyone,
I am part of a team of aviation professionals on a mission to fix one of the oldest problems in the aviation industry: NOTAMs. This is a collaborative project for the benefit of the industry, and most of the participants, including myself, are doing this pro bono publico.
TL;DR
NOTAM stands for Notice to Airmen. It is an almost century-old system for passing aeronautical information to pilots, dispatchers and everyone involved in flight planning.
Problem
The NOTAM briefing system is a worldwide standard. It presents information in a coded, upper-case, incredibly un-human-friendly format, is overloaded with irrelevant information, and produces 100-page briefing packages for flight crews that are simply impossible to read and understand. For pilots and passengers alike, this creates unacceptable risk.
Solution (as we imagine it)
Create an extra layer of information analysis and classification using AI that presents NOTAMs to the reader in a concise and clear way. We are thinking of using LLMs for this task. Afterwards, the processed information may be added to the briefing package.
We need your recommendation on the best way to use LLMs for these tasks.
Some more information
For detailed information about the effort please visit https://fixingnotams.org/ and
We have developed a NOTAM classification matrix. We are using OpenAI's API, and by providing the following input we get quite good results.
return [
    [
        'role' => 'system',
        'content' => 'You are a NOTAM Librarian and categoriser who can decode and organise NOTAMs. Your replies are always json objects with no formatting.',
    ],
    [
        'role' => 'assistant',
        'content' => "Here is the array of NOTAM Tags, each tag has three columns: 'Code', 'Tag Name', 'Tag Description': TAGS REMOVED HERE BUT INCLUDED IN ACTUAL REQUEST",
    ],
    [
        'role' => 'user',
        'content' => <<<'EOL'
I will give you a json_encoded NOTAM message. Each notam should be identified using the `id` field. The content uses the `text` field.
Create a json object exactly as follows:
{
"id": The notam `id` field,
"type": Choose the most logical Tag for this NOTAM from the list of previous defined tags,
"code": The code for the selected Tag Name from the list of previous defined codes,
"summary": In very simple English only, explain the NOTAM in a maximum of seven words. Use sentence case and limit abbreviations,
}
Do not use any display formatting and ensure a valid json object is returned.
EOL,
    ],
];
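Since the prompt asks for a strict JSON object, each reply can be checked mechanically before it goes anywhere near a briefing package. Below is a minimal sketch of such a check. It is written in Python rather than PHP purely for illustration. The tag codes and names in `ALLOWED_TAGS` are made up (the real ones come from our classification matrix), and `validate_reply` is a hypothetical helper, not part of any library.

```python
import json

# Hypothetical tag codes and names, for illustration only. The real list is
# the classification matrix that is sent in the assistant message.
ALLOWED_TAGS = {"A1": "Runway closed", "B2": "Taxiway closed"}
REQUIRED_KEYS = {"id", "type", "code", "summary"}
MAX_SUMMARY_WORDS = 7


def validate_reply(raw_reply, expected_id):
    """Return a list of problems found in one model reply (empty if valid)."""
    try:
        reply = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(reply, dict):
        return ["reply is not a JSON object"]

    problems = []
    missing = REQUIRED_KEYS - reply.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    # The reply must refer to the NOTAM that was actually sent.
    if "id" in reply and reply["id"] != expected_id:
        problems.append("id does not match the NOTAM that was sent")

    # The code must exist in the matrix, and the type must be its tag name.
    if "code" in reply:
        code = reply["code"]
        if not isinstance(code, str) or code not in ALLOWED_TAGS:
            problems.append(f"unknown tag code: {code!r}")
        elif reply.get("type") != ALLOWED_TAGS[code]:
            problems.append("type does not match the tag name for this code")

    # The prompt limits the summary to seven words.
    if "summary" in reply:
        summary = reply["summary"]
        if not isinstance(summary, str):
            problems.append("summary is not a string")
        elif len(summary.split()) > MAX_SUMMARY_WORDS:
            problems.append("summary is longer than seven words")

    return problems
```

Replies that come back with a non-empty problem list could be retried or routed to a human reviewer. A check like this only catches format and consistency errors; it cannot tell whether the chosen tag is actually the right one.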
I understand that there are several ways of tackling the issue, but we are not sure how to proceed with LLMs or which technique/approach we should use. Should we create our own dataset and fine-tune an LLM on it (using LoRA)? Should we go for RAG (using LangChain)? Then again, our problem is not that the data source is unreliable or out of date. Rather, we want to make sure that the LLM can correctly a) analyze a NOTAM, b) make it shorter and easily readable, and c) add the correct tag to the NOTAM.
Your advice is much appreciated.
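Whichever approach is suggested, we would probably want to compare the options on the same hand-labelled set of NOTAMs. Here is a minimal sketch of how goals b) and c) might be scored, again in Python for illustration only. `score_predictions` is a hypothetical helper, and it assumes that each prediction is a dict shaped like the JSON object the prompt requests and that `gold_codes` maps NOTAM ids to the tag codes assigned by human experts.

```python
def score_predictions(predictions, gold_codes, max_words=7):
    """Score model output against expert labels.

    predictions: list of dicts with "id", "code" and "summary" keys
                 (assumed to match the JSON object the prompt requests).
    gold_codes:  dict mapping NOTAM id -> expert-assigned tag code.
    """
    total = len(gold_codes)
    if total == 0:
        raise ValueError("gold_codes must contain at least one labelled NOTAM")

    by_id = {p["id"]: p for p in predictions if "id" in p}
    correct_tags = 0
    summaries_ok = 0
    for notam_id, gold_code in gold_codes.items():
        pred = by_id.get(notam_id)
        if pred is None:
            # A NOTAM with no prediction counts as wrong on both measures.
            continue
        if pred.get("code") == gold_code:
            correct_tags += 1
        words = str(pred.get("summary") or "").split()
        if 0 < len(words) <= max_words:
            summaries_ok += 1

    return {
        "tag_accuracy": correct_tags / total,
        "summary_within_limit": summaries_ok / total,
    }
```

This only measures tag agreement and summary length. Whether a summary is actually correct and readable (goals a and b) would still need review by pilots or dispatchers.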
Thank you!