Feb 5, 2026

When AI Is Allowed to Guess: The Hidden Risks in Customs Classification

In March 2025, US Customs and Border Protection completed 71 audits that identified $310 million in duties and fees owed to the US government. If you break that down, it averages out to more than $4 million per audit. For a massive conglomerate, a bill like that is a bad quarter. For a mid-sized importer, it is an extinction event.

This statistic should wake up every compliance manager in the country. It signals that CBP has taken the gloves off. They are using better data and sharper targeting tools to find revenue leaks. We are nearly a year past that March 2025 report, and the trend has not slowed down. The pressure to move goods faster is crushing. Supply chains are tight. Margins are thin. In this environment, the temptation to use a tool like ChatGPT or Google Gemini to classify your products is overwhelming. It feels like magic. You paste a description, it gives you a code, and you move on.

But there is a catch. When you let a general-purpose AI guess your Harmonized Tariff Schedule (HTS) codes, you are gambling your license on a coin toss.

What HS and HTS Codes Actually Are

To understand the risk, you have to understand the system. Global trade runs on the Harmonized Commodity Description and Coding System. Most people just call it the Harmonized System or HS. It is managed by the World Customs Organization (WCO). This system gives us a standard way to use numbers to describe products.

The first six digits of a code are the same almost everywhere. But then it gets complicated. Individual countries add their own digits to the end. These extra numbers are for their specific tariffs and data tracking. In the United States, we use the Harmonized Tariff Schedule (HTS). An HTS code is usually ten digits long. Those last few digits are critical. They decide your duty rate. They decide if you can use a free trade agreement. They even decide if other agencies like the FDA or FCC need to get involved.
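As a rough illustration of that layering, the sketch below splits a ten-digit US code into its nested levels. The names `HtsCode` and `parse_hts` are made up for this example, and the digits used are placeholders, not a real classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HtsCode:
    chapter: str             # first 2 digits
    heading: str             # first 4 digits
    hs_subheading: str       # first 6 digits: the internationally shared HS part
    us_tariff_line: str      # first 8 digits: US-specific line that carries the duty rate
    statistical_suffix: str  # last 2 digits: US statistical reporting


def parse_hts(raw: str) -> HtsCode:
    """Split a 10-digit HTS code (dots and spaces allowed) into its levels."""
    digits = raw.replace(".", "").replace(" ", "")
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected 10 digits, got {raw!r}")
    return HtsCode(digits[:2], digits[:4], digits[:6], digits[:8], digits[8:])


# Placeholder digits only; this is not a real product classification.
code = parse_hts("1234.56.78.90")
print(code.hs_subheading, code.us_tariff_line, code.statistical_suffix)
```

Only `hs_subheading` would be comparable across countries. Everything after it is specific to the US schedule.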

Why Classification Is Harder Than It Looks

If classification were just a matching game, computers would have solved it years ago. You would just find the word "shoe" in a list and pick the number next to it. But that is not how it works. Classification is a legal decision. It is not about language. It is about the law.

The hard part is the grey areas. These are governed by the General Rules of Interpretation (GRIs). Think about a "smart" athletic shoe. It has a sensor in the sole to track your running speed. Is it a shoe? Or is it a radio transmitter?

The answer depends on its "essential character." This is a legal concept. It is subjective. To get it right, you need to understand Section Notes and Chapter Notes deep inside the tariff schedule.

Here are some common headaches:

Composite Goods and Sets are tricky: Imagine a kit that has a flashlight, a knife, and a compass inside. You cannot just classify each item separately if they are put up together as a set for retail sale. Under GRI 3(b), you have to decide which item gives the set its essential character. That one item dictates the code for the whole box.

Parts versus Accessories: This distinction causes a lot of fights with Customs. You have to tell the difference between a part that a machine needs to work and an accessory that just makes it better. If you get it wrong, the duty rate could jump from zero to 25%.

Material Composition matters: You might have a shirt that is a mix of polyester and cotton. A tiny change in the percentage of each fabric can change the classification completely. If you change the code, you change the duty you owe.

Small errors create big risks. If you mess up the last few digits of an HTS code, you might miss out on a duty-free entry. Or worse, you might accidentally classify your product into a category that gets hit with a 25% tariff under Section 301.
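To put a number on that, here is a minimal sketch of the arithmetic, assuming a simple ad valorem rate. The shipment value and shipment count are invented for illustration, and the function names are hypothetical. Penalties and interest are deliberately left out.

```python
from decimal import Decimal


def duty_owed(entered_value: Decimal, ad_valorem_rate: Decimal) -> Decimal:
    """Duty for a simple ad valorem rate, rounded to cents."""
    return (entered_value * ad_valorem_rate).quantize(Decimal("0.01"))


def reclassification_exposure(
    entered_value: Decimal,
    declared_rate: Decimal,
    correct_rate: Decimal,
    shipments: int,
) -> Decimal:
    """Unpaid duty if every shipment used the wrong rate (no penalties or interest)."""
    per_shipment = duty_owed(entered_value, correct_rate) - duty_owed(entered_value, declared_rate)
    return per_shipment * shipments


# Assumed figures: 24 shipments a year at $80,000 each, declared duty-free,
# but the correct code carries a 25% rate.
exposure = reclassification_exposure(Decimal("80000"), Decimal("0"), Decimal("0.25"), 24)
print(exposure)  # 480000.00
```

Under these assumed figures, one wrong code leaves $480,000 in unpaid duty after a single year, before CBP adds anything on top.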

The Problem with "Quick" Solutions like Google and Generic AI

Complexity is high, so many logistics pros take the path of least resistance. They use Google. And now, they use general-purpose AI.

We see this workflow all the time. An employee copies a product description. Maybe it is "Stainless steel kitchen sink with sprayer." They paste it into ChatGPT, Gemini, or Microsoft Copilot. Then they ask, "What is the HTS code for this?"

The AI answers immediately. It usually sounds very confident. It might even explain its reasoning. But this convenience is a trap.

The Hallucination Problem

General AI models are probabilistic engines. They are guessing the next likely word in a sentence. They are trained on the whole internet. They are not logic engines. They are not connected to a live database of laws. Research shows that they suffer from "hallucination." This means the AI makes up information that sounds true but is actually false.

A study in Nature about large language models pointed this out. These tools are fluent. They speak well. But they often cannot tell the difference between a reliable fact and probable fiction. In trade, an AI might invent a Section Note that does not exist. It might cite a Customs Ruling that was revoked five years ago.

There is another issue. General models are rarely updated in real time. The HTS is a living document. The US International Trade Commission (USITC) updates it often. A general model might be trained on data from 2023. It will not know about tariff changes from 2025. If you rely on it, you risk being non-compliant from day one.

Real-World Consequences

When a business relies on a guess, bad things happen. It does not matter if the guess came from a human on Google or a bot. The consequences hit the supply chain hard.

Financial liability is the biggest risk: If an importer uses a generic code with a high duty rate, they are wasting money on every shipment. But underpaying is worse. The Customs Modernization Act says importers must use "reasonable care." If you underpay duties, even by accident, you are breaking the law. If CBP audits you and finds a pattern of errors, you face back-duties. You also face heavy penalties and interest.

Your supply chain can stall: CBP uses automated systems to find weird data. Suppose your entry summary lists a code that does not match the description on the cargo manifest. The system will flag it. Your container might get pulled for inspection. A "quick" AI classification can lead to your container sitting at the Port of Long Beach for weeks. The demurrage fees alone will cost more than any compliance software.

Why Domain-Specific Tools Are Different

The trade industry knows accuracy is a problem. That is why we are seeing a move toward domain-specific AI. These tools are not general chatbots. They are engineered for customs compliance.

The Architecture is different: Domain-specific engines usually do not guess answers from the open internet. The AI is forced to look for answers only in a verified library. This includes the current HTS tables, the Explanatory Notes, and the Customs Rulings Online Search System (CROSS).

You can audit the work: A key difference is provenance. You need to know where the information came from. A compliance officer needs to know why a code was chosen. Specialized tools give citations to specific rulings. This lets a human broker validate the logic.

Verified Performance: Vendors are starting to test their models against real standards. Gaia Dynamics' classification engine scored 100% on the classification section of the Customs Broker License Exam (CBLE) twice.

This matters. A general AI might pass a bar exam by memorizing books. But tariff classification requires applying hierarchical logic. Purpose-built engines are proving to be much more reliable at this than general models.
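The "verified library" idea described above can be sketched in a few lines. This is not how any vendor named in this article actually works. It is a toy, assuming a simple keyword-overlap search in place of real retrieval, and the library entries and citation labels are placeholders. The point it shows is the contract: the tool may only answer from sources it can cite, and it hands the item to a human when it finds none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    citation: str  # label a reviewer can look up
    text: str


# Hypothetical stand-in for a verified library of notes and rulings.
LIBRARY = [
    Source("EXAMPLE-NOTE-1", "placeholder note text about stainless steel kitchen sinks"),
    Source("EXAMPLE-RULING-2", "placeholder ruling text about footwear with embedded sensors"),
]


def tokens(text: str) -> set[str]:
    return set(text.lower().split())


def retrieve(query: str, library: list[Source], min_overlap: int = 2) -> list[Source]:
    """Return sources sharing at least min_overlap words with the query, best first."""
    q = tokens(query)
    scored = [(len(q & tokens(s.text)), s) for s in library]
    hits = [(n, s) for n, s in scored if n >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in hits]


def answer_with_citations(query: str, library: list[Source]) -> dict:
    """Never answer without a source: no citation means a human takes over."""
    sources = retrieve(query, library)
    if not sources:
        return {"status": "needs_human_review", "citations": []}
    return {"status": "candidate", "citations": [s.citation for s in sources]}


print(answer_with_citations("Stainless steel kitchen sink with sprayer", LIBRARY))
print(answer_with_citations("wooden chair", LIBRARY))
```

A general chatbot has no equivalent of the second branch. It produces a code either way.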

The 2026 Market for Global Trade Management Platforms

The market for trade compliance tech has split in two. On one side, you have massive "all-in-one" Global Trade Management (GTM) platforms. They treat classification as just one small module. On the other side, you have specialized engines. These focus entirely on getting the classification right.

Here is a detailed look at the primary players. We will look at both the specialists and the enterprise giants.

1. Gaia Dynamics

Gaia is an emerging specialist. They focus on high-stakes accuracy and auditability. They are a domain-specific AI classification engine. Their key strength is verified logic. Unlike general tools, Gaia benchmarks its engine against human expert standards. The company announced its engine scored 100% on the classification section of the Customs Broker License Exam (CBLE) twice. It uses a "Retrieval-Augmented" architecture. This forces the AI to cite specific legal notes and rulings for every code it generates. This provides a transparent audit trail. The limitation is scope. Gaia is a classification and reasoning engine. It does not currently offer logistics execution or broader ERP functions.

2. Thomson Reuters ONESOURCE (Global Trade Management)

This is the traditional market leader for large multinationals. It is especially strong for companies with complex tax and finance needs. Thomson Reuters is an enterprise tax and trade platform. Their key strength is content authority. They employ over 200 in-house analysts who update their "Global Trade Content" database constantly. Their "Smart HS" tool integrates deeply with their tax software. This makes it ideal for companies where customs and finance departments work closely together. However, it has limitations. It is a heavy-duty enterprise solution. Implementation can take months. The price is often too high for mid-market importers.

3. Avalara

Avalara is widely known for sales tax, but they have become a leader in cross-border e-commerce. Their tariff code classification tool is designed for high volume. It works particularly well for retailers with large catalogs of lower-complexity goods. They offer "Self-Serve" and "Managed" options. A standout feature is image recognition: you can upload a photo of a product, and the AI helps classify it. It supports over 180 countries. While excellent for retail and e-commerce, it is sometimes viewed as less specialized for complex industrial equipment or chemicals compared to legal-focused tools like Thomson Reuters.

4. iCustoms.ai

iCustoms is a UK-based challenger making waves with its "iClassification" tool. They focus heavily on automating the customs declaration process with AI. They claim 99% accuracy and emphasize "rock solid compliance." The platform offers four distinct ways to classify: manual search, auto-search (AI), hierarchical search (guided questions), and bulk upload. It is designed to be an "all-in-one" solution that handles the declaration filing after classification. As a newer entrant compared to the legacy giants, their track record in the North American market is still growing.

5. Altana AI

Altana operates differently from a standard software vendor. They built the "Altana Atlas," a dynamic map of the global supply chain. They focus on the "Value Chain" rather than just the code. Their biggest validation is a contract with U.S. Customs and Border Protection (CBP). CBP uses Altana’s map to help enforce forced labor laws (UFLPA). If the regulator uses it, that is a strong endorsement. They provide deep visibility into multi-tier supplier networks. Their primary focus is often on supply chain visibility, forced labor, and sanctions (the "who" and "where") rather than just the "what" (HS classification), though they handle both.

6. Descartes Systems Group (CustomsInfo)

Descartes is the data backbone of the industry. They own "CustomsInfo," which is the database that powers many other software platforms (including SAP and Oracle). Their key strength is data volume. If you need raw data such as tariff tables, rulings, and regulations for almost any country, they have it. It is the industry standard for data completeness. The limitation is usability. The user interface is often technical. It is built for brokers and data scientists, not necessarily for the average procurement manager who just needs a quick answer.

Practical Recommendations & Best Practices

Companies need to modernize their classification process. But they cannot expose themselves to risk. Here are some best practices to follow.

Validate AI Suggestions. You must treat any AI output as a suggestion. It is not a final decision. A qualified human must review the final code. This should be a licensed broker or a compliance specialist.

Require Confidence Scores. You should use tools that provide a confidence score. If the tool is only 60% sure, the item requires manual review. Do not trust a low score.

Log the Rationale. You need a digital paper trail. If a tool suggests a code based on a specific ruling, you must save that citation. If CBP audits you, you need to show you exercised "reasonable care." Showing that you consulted valid sources is a strong defense.

Isolate Borderline Items. You should create a separate workflow for complex goods. Composite items and high-tech components are tricky. Their classification depends on "essential character." These almost always require human legal analysis.

Run Periodic Audits. You should run post-entry audits on a sample of your classifications every year. This helps you catch systemic errors before CBP finds them.
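The practices above fit together as one small workflow. The sketch below is one possible shape for it, not a standard. The 0.90 threshold, the queue names, the record fields, and the 5% sample size are all assumptions you would replace with your own policy.

```python
import json
import random
from dataclasses import dataclass, asdict

REVIEW_THRESHOLD = 0.90  # assumed cut-off; set this from your own risk policy


@dataclass
class Suggestion:
    sku: str
    hts_code: str
    confidence: float         # 0.0 to 1.0, as reported by the tool
    citations: list[str]      # rulings or notes the tool relied on
    borderline: bool = False  # composite goods, sets, high-tech components


def route(s: Suggestion) -> str:
    """Pick a review queue. Every queue ends with a human decision."""
    if s.borderline:
        return "legal_analysis"
    if s.confidence < REVIEW_THRESHOLD or not s.citations:
        return "manual_review"
    return "broker_signoff"


def log_record(s: Suggestion, reviewer: str) -> str:
    """Serialize the suggestion, its citations, and its queue for the audit trail."""
    record = asdict(s) | {"queue": route(s), "reviewer": reviewer}
    return json.dumps(record, sort_keys=True)


def audit_sample(records: list, fraction: float, seed: int) -> list:
    """Draw a reproducible random sample of past classifications for post-entry audit."""
    if not records:
        return []
    k = min(len(records), max(1, round(len(records) * fraction)))
    return random.Random(seed).sample(records, k)


# Placeholder SKU, code, and citation label.
low = Suggestion("SKU-001", "1234567890", 0.60, ["EXAMPLE-RULING-2"])
print(route(low))
print(log_record(low, reviewer="j.doe"))
```

A fixed `seed` makes the yearly sample reproducible, so you can show an auditor exactly which entries were pulled and why.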

Technology has finally caught up to the complexity of the Harmonized System. It offers importers and brokers powerful new ways to manage efficiency. But there is a big difference between a "likely" answer from a chatbot and a "legal" answer from a domain-specific engine. That difference is what stands between a compliant supply chain and a costly audit.

Tools like Gaia Dynamics provide the necessary guardrails to use AI safely. Ultimately, classification remains a regulatory responsibility. You must verify your data. You must trust your experts. And you must choose your tools with care.