Rome is building the first global system of record for logistics departments in Fortune 500 enterprises (known in the industry as shippers). We do this by bypassing the fragmented operational systems of shippers and their service providers, and mining data from raw shipping documents instead.
To make this work, we need a fully automated system capable of achieving three things:
1: Reading Documents
The first task is probably the least unique part of our system; it is largely solved by existing OCR models, combined with a simple LLM pass to correct the few issues that remain (mostly related to capturing table contents accurately).
Try Google’s OCR Demo with this real-world sample doc — the OCR is not perfect, but nothing an LLM pass can’t fix.
We also use an LLM to capture and standardize the semantics of less structured documents, such as carrier contracts.
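The read step above can be sketched as a two-stage pipeline: raw OCR text in, LLM-corrected rows out. This is a minimal illustration, not our production code; `call_llm` is a placeholder for a real LLM API call, and the prompt wording and `fix_table` helper are assumptions for the sketch.

```python
def call_llm(prompt: str) -> str:
    """Placeholder: a real implementation would call an LLM API here.
    For this sketch, pretend the model rejoined the split table row."""
    return "HAWB123 | 2 pallets | 840 kg"

def build_fix_prompt(raw_ocr_text: str) -> str:
    # Ask the model to repair the most common OCR failure mode we see:
    # table rows split or merged incorrectly.
    return (
        "The following text was extracted by OCR from a shipping document.\n"
        "Table rows may be split or merged incorrectly. Return each row as\n"
        "'waybill | pieces | weight', one row per line.\n\n" + raw_ocr_text
    )

def fix_table(raw_ocr_text: str) -> list[str]:
    corrected = call_llm(build_fix_prompt(raw_ocr_text))
    return [row.strip() for row in corrected.splitlines() if row.strip()]

# OCR split one logical row across two lines; the LLM pass rejoins it.
rows = fix_table("HAWB123 | 2 pallets\n| 840 kg")
print(rows)  # ['HAWB123 | 2 pallets | 840 kg']
```

The point is that the OCR output is kept verbatim and the LLM is only asked to repair local structure, which keeps the pass cheap and auditable.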
2: Standardizing Data
Autonomously standardizing data is much harder. While some OCR models (such as Google's) implement basic data mapping, this sadly works only in the simplest of cases, such as a field label appearing in a different language.
We therefore solve data standardization using an LLM, paired with the right kind of orchestration logic. In our case, the orchestrator is a rule-based constraint system along with a backtracking solver, which, while a bit complicated, works very well in practice. Effectively, the rule-based engine is an under-constrained representation of all possible logistics operations, in which the LLM acts as a scoring heuristic for finding a feasible solution that best conforms to the semantics of the data.
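To make the shape of this orchestration concrete, here is a minimal sketch: hard constraints prune the search space, a scoring function (standing in for the LLM) ranks candidates, and a backtracking solver maps source fields onto target-schema slots. The slot names, constraints, and the toy keyword-overlap scorer are all illustrative assumptions, not our actual schema.

```python
def is_numeric(value: str) -> bool:
    try:
        float(value.replace("kg", "").strip())
        return True
    except ValueError:
        return False

# Hard constraints per target slot: an under-constrained description of
# what any valid assignment must satisfy.
CONSTRAINTS = {
    "gross_weight": is_numeric,
    "origin_port": lambda v: v.isalpha() and len(v) == 3,  # e.g. an IATA code
    "consignee": lambda v: len(v) > 3,
}

def llm_score(slot: str, field_name: str, value: str) -> float:
    """Placeholder for the LLM heuristic: a real system would ask the model
    how well (field_name, value) matches the semantics of `slot`."""
    overlap = set(slot.split("_")) & set(field_name.lower().split())
    return float(len(overlap))

def solve(slots: list[str], fields: dict[str, str], assignment=None):
    """Assign each slot a distinct source field, best-scoring first."""
    assignment = assignment or {}
    if not slots:
        return assignment
    slot, rest = slots[0], slots[1:]
    # Feasible candidates only; the scorer guides the order of exploration.
    candidates = [
        f for f, v in fields.items()
        if f not in assignment.values() and CONSTRAINTS[slot](v)
    ]
    for field in sorted(candidates, key=lambda f: -llm_score(slot, f, fields[f])):
        result = solve(rest, fields, {**assignment, slot: field})
        if result is not None:
            return result
    return None  # dead end: backtrack to try the next candidate upstream

fields = {"Weight": "840 kg", "From": "LAX", "Deliver To": "Acme Corp Inc"}
print(solve(list(CONSTRAINTS), fields))
```

In the real system the constraint set is far richer and the scorer is an actual LLM call, but the division of labor is the same: rules guarantee feasibility, the model supplies semantic judgment.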

The key insight here is that without LLMs, a data engineer would have to map out the *exact* constraints for representing data in a given target format, whereas with LLMs we can afford to be quite loose, and rely on the model's general "understanding" of reality to pick up the slack.
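To show the contrast in miniature: an exact, hand-written mapping breaks the moment a carrier renames a column, while a loose constraint plus a semantic judgment does not. The label variants and the `semantic_match` stub below are illustrative assumptions; in practice the judgment would come from an LLM.

```python
from typing import Optional

# The brittle approach: every spelling must be enumerated by hand.
EXACT_RULES = {"Gross Weight (kg)": "gross_weight"}

def map_exact(label: str) -> Optional[str]:
    return EXACT_RULES.get(label)

def semantic_match(label: str, slot: str) -> bool:
    """Placeholder for an LLM judgment: does this label mean `slot`?"""
    return any(w in label.lower() for w in ("weight", "wt", "mass"))

def map_loose(label: str, value: str) -> Optional[str]:
    # Loose constraint: the value is numeric AND the model says the
    # label is weight-like. No enumeration of spellings required.
    if value.replace(".", "").isdigit() and semantic_match(label, "gross_weight"):
        return "gross_weight"
    return None

print(map_exact("Chgs Wt"))         # None: the exact rule misses the variant
print(map_loose("Chgs Wt", "840"))  # gross_weight
```

The loose version tolerates unseen formats by design, which is exactly the slack the LLM is covering for.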
So far, we have found that earlier generations of LLMs (e.g. gpt-3.5-turbo) are unreliable at this task, but later generations perform almost on par with a knowledgeable human.
This LLM-as-a-data-engineer is probably the most immediate "why now" moment for our technology. But don't take my word for it; take it from Ryan Peterson of Flexport. Flexport tried, and failed, for years to automate data capture from the vast array of shipping document formats, until LLMs came along: