Applying machine learning to the freight industry (traintracks.io)
127 points by chungchungz on Sept 30, 2016 | 53 comments

Applying machine learning to any industry.

Here's a checklist that I just came up with:

  Step 1. Pick an industry. Any industry.
  Step 2. Find a problem that can be formulated as a function.
  Step 3. Is that function non-trivial? If not, go back to Step 1.
  Step 4. List all the input parameters for that function.
  Step 5. Is any of them accurately observable? If not, focus on that parameter and go back to Step 2.
  Step 6. Apply some ML to it. (Choice of the tech wouldn't matter that much.)
  Step 7. Had some improvement?

It seems like they're following the even more basic/old school formula first and then adding on the ML:

    Step 1. Pick a domain with loads of trivial, low-value paperwork
    Step 2. Convert the paperwork into online forms 
    Step 3. Plug your forms into existing player's tech
    Step 4. Battle to get adoption

  Step 5. Add some blockchain technology to it or call it "disrupting".
  Step 6. Collect venture millions

Step 7. Get Bored.

Step 8. Create Space Rocket Company.

Step 9. Go to Mars.

Edit: Missed a vital step :-

Step 6.5 Sell Company for Billions

If the problem is painful enough, it doesn't matter if the function is trivial.

Now write a startup AI and raise your millions

To be fair, that seems a pretty accurate list. You will find yourself stuck in step two quite often.

Wrong. What about an industry where data is either hard or unethical to gather such as the medical industry?

Isn't that step 5? I took that to be: "Do you have all of the data necessary to formulate an answer? If not, focus on solving that problem or eliminate the need for the data you can't get by going back to step two and picking a formula that doesn't need the data you can't get."

I'm not sure if I just read a post detailing just that: https://backchannel.com/you-too-can-become-a-machine-learnin...

Step 5. Is any of them accurately observable? If not, implement a Kalman filter.
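
For anyone who hasn't met the Kalman filter mentioned here, a minimal scalar sketch in Python follows. It assumes the hidden quantity is roughly constant and that the readings carry Gaussian noise; the function name, the variances, and the simulated data are all illustrative assumptions, not anything from the article.

```python
import random


def kalman_1d(measurements, process_var=1e-3, meas_var=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a roughly constant hidden value.

    process_var and meas_var are assumed noise variances (tuning knobs).
    """
    x, p = x0, p0  # current estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the model says "value stays put", so only uncertainty grows.
        p = p + process_var
        # Update: blend the prediction with the measurement via the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Simulated noisy observations of a hidden value (made-up data).
random.seed(0)
true_value = 10.0
noisy = [true_value + random.gauss(0.0, 0.7) for _ in range(200)]
estimates = kalman_1d(noisy)
print(round(estimates[-1], 2))
```

The filter does not make a bad sensor good; it only weighs each new reading against what the model already believes.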

This is missing the fundamental step which is collecting data.

I think you missed step 5.

Step 8. Tell everyone how innovative you are

Step 9. Constant Blog post on your "Big Data" problems

My super basic understanding of deep learning is that it lets you skip step 5. Is that true?

No. Garbage in, garbage out is true for deep learning as well.

A lot of the statements made in the article are rather questionable.

Applying "digital learning" to sailing schedules and saying that is a data culture? If the input is bad, improve the input of the data. What is stated regarding outdated schedules highly depends on the company. Some companies might be bad, but why try to improve bad data?

It seems more that they don't know about the different schedules you usually have. One is the proforma. It tells you that there should be a weekly call on some weekday at some time. But then for practical purposes you look at the estimated times the vessel arrives.

In the article they pretend that a vessel suddenly departed a day early without anyone knowing. That's really not how that works. Cargo needs to be delivered to the terminal. You're not going to silently advance such a vessel and not be able to fill it up with cargo. That cargo then stands at the terminal for 6-7 days and causes problems for the next call (too much cargo).

Then this one: > We’re still pretty much the only company that tries digitalization of an end to end shipment.

What about https://www.inttra.com/ ?

That the shipping industry as a whole is very inefficient is known. But it still seems like a lot of statements made in this article are rather questionable.

> We’re still pretty much the only company that tries digitalization of an end to end shipment.

They are totally BS'ing there:

Diversified Transportation Services (DTS) has had digital end-to-end shipping since at least 2009. That platform has made them a fortune. (Source: I worked with them for 5+ years and selected them as our preferred 3PL vendor because it was all web-based from end-to-end):


So I went to the site and you can't make a booking there. So you have to email in for a quote. Where is the digitalization there?

Or maybe the fact that you can send an email through a form on the website? That's your definition of digitalization in 2016?

This is kind of a meta-comment, but one thing I love about this site is the amount of domain knowledge that comes out of the woodwork when articles like this are posted.

Think this commenter needs to get their facts right first.

First, inttra doesn't allow the shipper to make the booking; it is purely a tool for forwarders. What Kontainers is doing is allowing the shipper to do that themselves, and that change in behaviour is actually a pretty big deal. So inttra is not a competitor. Kontainers is removing this step; you're making a comment on something you obviously know nothing about.

You should read the article properly before making comments. A schedule doesn't suddenly depart. What the article is saying is that the data doesn't get into the distribution system fast enough but you can still ring up to get that schedule.

Flexport, which is a YC company, seems to be doing the same thing. Their guy is on here regularly and should have something to say about this.

Flexport, though, is a real freight forwarder. They take responsibility for end to end delivery as a common carrier. If you shipped through Flexport and it's stuck on a bankrupt Hanjin ship somewhere, it's Flexport's job to get it unstuck. Does Kontainer do that? They seem to be more like a price and schedule comparison site / lead generation system.

Xeneta does it as well. They published an article a few months ago explaining how they use NLP to help their sales team with lead qualification: https://medium.com/xeneta/boosting-sales-with-machine-learni...

Xeneta is more on data analytics of freight rates and doesn't allow bookings. So not a competitor to Kontainers

Kontainers does the same.

That was an interesting article. In the past I have worked with bulk commodity shipping (i.e. ore and coal) rather than commercial freight. In my case the company owned the berth and the unloaders. So it was a similar but slightly different problem space. The biggest issue was obtaining a reliable schedule from the ships. When I worked there the harbor master communicated directly with the ship's captain about expected arrival date etc. and it was all stored on spreadsheets.

There is probably room in this space for some ML smarts similar to what is in the article (extrapolating arrival times based on weather, ship's condition, Panama Canal data etc.) and forecasting ahead for a more accurate arrival date.

The benefit of doing this comes from something called "Demurrage". Demurrage is a special type of charge that is incurred when there are unloading delays. Because of the absence of accurate scheduling, it wasn't unheard of for multiple ships to arrive in close proximity to each other. When this happened the ships would be forced to line up outside the harbor, either waiting for tug boat availability to tow them into berth or waiting for actual space on the berth. They'd be sitting there racking up demurrage charges. There is a pay off in optimising berth scheduling, whether it's large enough to build a company around I'm not sure...

> There is a pay off in optimising berth scheduling, whether its large enough to build a company around I'm not sure...

It's a small but quite lucrative market. We're talking about less than 10 players for 10M+ contracts.

Do you know if demurrage charges are applied for vessels sitting offshore as storage?

As I understand it the demurrage charges apply when a third party doesn't receive the goods (or the use of the ship) by the expected time, so presumably isn't applied if the ship is intended to hold the goods offshore over a particular period.

There's a big market in scheduling and monitoring systems for commercial aviation but the economics are very different (delays equal very expensive fuel burn and have safety and environmental implications, schedules are tight and turnaround and connection times measured in minutes, time slots at certain airports are multimillion dollar tradeable assets) as is the level of regulation.

I'm not sure in that case. It may be contract specific. If the ship is waiting on harbour control to let it in to unload demurrage definitely applies. If it has not been formally handed over to harbour control it may not.

In container shipping, demurrage charges also apply to the individual containers. If a container misses a sailing and sits on the dock too long waiting for the next one, it starts to rack up charges on the order of $100 or $150 a day.
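
To put rough numbers on that, here is a small Python sketch of the per-container arithmetic. The $100 and $150 daily rates come from the comment above; the 7-day wait (one missed weekly sailing) and the `free_days` parameter are assumptions for illustration, since real terms vary by contract.

```python
def container_demurrage(days_on_dock, daily_rate, free_days=0):
    """Charge for days beyond the free period.

    free_days is a hypothetical contract term; set it to 0 to charge every day.
    """
    chargeable_days = max(0, days_on_dock - free_days)
    return chargeable_days * daily_rate


# One container that misses a weekly sailing and waits 7 extra days,
# priced at both ends of the quoted range.
low = container_demurrage(7, 100)
high = container_demurrage(7, 150)
print(low, high)  # 700 1050
```

Multiply that by a few dozen containers on a single booking and a missed sailing gets expensive quickly.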

What happened to the gold standard of logistics industry, which is Operations Research (OR)?

OR has essentially been customized to this domain long before generalized ML appeared on the scene - I can't imagine some off-the-shelf scikit-learn libraries have improved on this much.

It's being used in the industry heavily. The term is extremely unsexy. In my mind, I'm an OR analyst. But if people ask me my job, I try to keep a straight face and say I'm a data scientist with an expertise in prescriptive analytics.

There are literally dozens of us.

Operations Research is totally unsexy. It still works better than ever before; I'm shipping a work project based on stochastic programming pretty soon. But we'll probably call it "AI" in marketing materials.


I work in OR, specifically Computational Logistics.

There is plenty of room for operational improvement in all industry, all the time. Articles like this paint us as big and stupid because that's how the "disruption" people see all industry.

Often it is not the optimization that is lacking but the will of disparate companies to co-operate.

An example - the Port of Rotterdam barge system is horribly inefficient for the barge operators because the individual shippers see minor gains. The challenge is getting the shippers to agree and co-operate, not to machine learn the best routing for the barges.

I think OR only works if the models are linear.

Huh! what ?

I think you are thinking of linear programming, and OR is way more than that. OR is the first chapter of an OR book.

Too late to edit, meant LP is the first chapter in an OR book.

Yes, and there are several ways of linearizing models for OR

(however I think the MIP people waste too much time looking for the perfect linear model and the best solution instead of using things like simulated annealing which might give a very good solution in 10% of the time)
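
For readers who haven't seen simulated annealing, a minimal Python sketch on a toy routing problem follows. The random points, the geometric cooling schedule, and every parameter value are illustrative assumptions. It returns a good tour with no optimality guarantee, which is exactly the trade-off being described.

```python
import math
import random


def tour_length(tour, pts):
    """Total length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def anneal(pts, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing with segment-reversal (2-opt style) moves."""
    rng = random.Random(seed)
    tour = list(range(len(pts)))
    cost = tour_length(tour, pts)
    best, best_cost = tour[:], cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = sorted(rng.sample(range(len(pts)), 2))
        # Candidate move: reverse the segment between positions i and j.
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        cand_cost = tour_length(cand, pts)
        delta = cand_cost - cost
        # Always accept improvements; accept worse moves with prob exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            tour, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = tour[:], cost
    return best, best_cost


# Toy instance: 30 random stops in the unit square (made-up data).
point_rng = random.Random(1)
points = [(point_rng.random(), point_rng.random()) for _ in range(30)]
initial_cost = tour_length(list(range(30)), points)
best_tour, best_cost = anneal(points)
print(round(initial_cost, 2), round(best_cost, 2))
```

A MIP solver can prove how far a solution is from optimal; this sketch cannot, it just keeps the best tour it has seen.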

There are nonlinear solvers available but they are slower and not guaranteed to converge on a globally optimal solution.

Your statement is not universally true, it depends on the nonlinearity.

It's correct in the general case. A solver algorithm can find a globally optimum solution for some nonlinear functions, but it cannot be guaranteed to find a globally optimum solution for arbitrary nonlinear functions. That would be like solving P = NP or the Halting Problem.
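
A tiny Python illustration of the point about local solvers, using plain gradient descent as a stand-in (this is an assumption for illustration, not how any particular commercial solver works). The function f(x) = x^4 - 3x^2 + x has two valleys, and the starting point decides which one you end up in.

```python
def f(x):
    """Nonconvex test function with two local minima."""
    return x**4 - 3 * x**2 + x


def grad_f(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    """Plain fixed-step gradient descent; it only ever moves downhill locally."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x


x_left = gradient_descent(-2.0)   # ends in the deeper valley, near x = -1.30
x_right = gradient_descent(2.0)   # ends in the shallower valley, near x = 1.13
print(round(x_left, 3), round(f(x_left), 3))
print(round(x_right, 3), round(f(x_right), 3))
```

Both runs stop where the gradient is zero, but only the first one finds the global minimum, and nothing in the second run signals that a better answer exists elsewhere.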

The freight industry is not that simple, though. In some ports, it involves a huge infrastructure, country laws (inspections from health authorities, customs and even the navy sometimes) and so on.

In that case, 20 calls are just the beginning. You can't ship a matchbox without hiring a company to do the (sometimes dirty) job for you. Data culture is worth nothing in places like that.

Spot on. I love ML, I love the potential. But as a former logistics systems manager I would NEVER trust anything that applied "fuzzy" logic to the already frangible processes that are involved in working in (and depending on) international logistics and 3PL.

Customs brokerage is the big thing for me - there is no substitute for a great customs brokerage house - those guys and gals work magic and have contacts at every level in even the most obscure countries in the world.

A potential problem with this business model is that small-medium enterprises who lack logistics departments (target market for the product) tend to buy their goods Delivered Duty Paid, which means the seller arranges the transportation. If the seller is in China, they are required by law to appoint a licensed freight forwarder agent to place the ocean booking for them. So you really cannot play ball in the eastbound transpacific trade until you have a licensed operation in the PRC--which requires forming a Joint Venture corporation with a Chinese company.

And it sounds like this company wants to streamline online booking, SLI and B/L creation and shipment tracking, but I don't think I saw anything about handling ordering of or payment for the goods themselves, so it's still up to the buyer and seller to arrange purchase orders, commercial invoices, letters of credit, etc.

"We’re still pretty much the only company that tries digitalization of an end to end shipment."


HN interface comment: This list is displayed badly in Android Chrome. I get an invisible 24-char-wide horizontally scrolling box to see the items in. Definitely suboptimal for reading. fovc's comment below in this thread also demonstrates this issue in the list.

Does anyone else have this issue?

If you would like to email us a screenshot at hn@ycombinator.com we can take a look.

We detached this subthread from https://news.ycombinator.com/item?id=12610520 and marked it off-topic.

Why not get one of the hacker news reader apps? I use one called 'Materialistic'.

Why should we need an app to read a website?

We shouldn't. But one is free to make that choice while waiting for the possible improvements that would resolve the website issues plaguing them.


