Not an expert here.
I am not sure how 3D the videos in the article are. IMO they are 3D in a Pixar/animated sort of way.
But I have very recent first hand experience of creating a video for our startup's Facebook post with Minimax image-to-video inference, from an image of our animated avatar character.
...And yes, at first the videos were bad quality with lots of inconsistencies, but after adding "animated" to the prompt, right before the word "man", the result was pretty great on the first try! I ended up actually using it. (You can check it here if interested: https://fb.watch/xRC-fptexM/)
Perhaps it should be self-evident, but still, it was not to me. :)
Edit. I guess my point was also that the animated character in the video ended up being somewhat 3D as well.
Yeah, I see what you mean! When we say 3D here, we mean working with actual 3D scenes (models, depth, and lighting) rather than the '3D movie' style like Pixar. The goal is to use 3D to control AI generations more precisely, so things stay consistent across frames instead of the AI hallucinating every frame from scratch.
Your experience with Minimax sounds cool! It makes sense that adding 'animated' to the prompt helped consistency: AI models often struggle with structure, so any guidance helps.
Well, you work in the field for a while and you accumulate anecdotes of colleagues dropping in tactical sleep(5000)s so they can shave a few milliseconds of latency off each week and keep the boss happy.
I love those stories but I could never do that with a straight face. However, the AI field is such an uphill battle against all the crap that LinkedIn influencers are pushing into the minds of the C-suite... I feel it's okay to get a bit creative to get a win-win here ;)
Yes, that's how it seems to me as well: often a RAG setup or similar gets branded as an "agent". I personally understand an LLM agent as something that takes input x, runs LLM inference on it, uses the output to build a new input for another LLM inference that includes the first output, and so on, repeating this more than once.
That's an LLM workflow, not an agent, if it's on rails set by a predefined workflow and doesn't make tool calls, or has no choice in which tools it calls. The tool calls are what give it agency.
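Not an authoritative definition, but here's a minimal sketch in Python of what that loop looks like, assuming a hypothetical call_llm() helper and made-up tools (nothing here is a specific vendor's API): the model either picks a tool or answers, the tool result goes back into the conversation, and inference runs again.

    def search_web(query: str) -> str:
        # Hypothetical tool: pretend web search.
        return f"(search results for {query!r})"

    def get_weather(city: str) -> str:
        # Hypothetical tool: pretend weather lookup.
        return f"(weather report for {city})"

    TOOLS = {"search_web": search_web, "get_weather": get_weather}

    def call_llm(messages: list[dict]) -> dict:
        # Dummy stand-in for a chat-completion call; a real one would hit an API.
        # It returns either {"tool": ..., "args": {...}} or {"answer": ...}.
        if not any(m["role"] == "tool" for m in messages):
            return {"tool": "search_web", "args": {"query": messages[0]["content"]}}
        return {"answer": "final answer built from the tool results"}

    def run_agent(user_input: str, max_steps: int = 5) -> str:
        messages = [{"role": "user", "content": user_input}]
        for _ in range(max_steps):
            reply = call_llm(messages)
            if "answer" in reply:               # the model chose to stop
                return reply["answer"]
            tool = TOOLS[reply["tool"]]         # the model chose a tool
            result = tool(**reply["args"])
            messages.append({"role": "tool", "content": result})
        return "gave up after max_steps"

    print(run_agent("What's the weather in Helsinki?"))

The point is that the model, not the programmer, decides which tool runs next and when to stop.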
Yeah. An agentic workflow is really just the execution of a bunch of tasks where each task gets a little help from the LLM. Honestly, I believe this applies to companies whose workflows involve a lot of manual tasks; automating those workflows could be easier with the help of LLM agents.
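For contrast with the agent loop above, a sketch of the "on rails" version, again with a dummy llm() helper standing in for a real completion call: the steps and their order are hard-coded, the model only fills in each step, so there is no choice of tools and no agency.

    def llm(prompt: str) -> str:
        # Dummy stand-in for a single chat-completion call.
        return f"(model output for: {prompt.splitlines()[0]})"

    def handle_support_ticket(ticket_text: str) -> dict:
        # The step order is fixed by the programmer; the model has no say in it.
        summary  = llm(f"Summarise this ticket:\n{ticket_text}")
        category = llm(f"Pick one category (billing/bug/other) for:\n{summary}")
        reply    = llm(f"Draft a polite reply for a {category} ticket:\n{summary}")
        return {"summary": summary, "category": category, "draft_reply": reply}

    print(handle_support_ticket("My invoice is wrong, please fix it."))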
Interesting! To me an 80% hit rate actually sounds pretty good, and awesome if it genuinely improves productivity, though understandably not something that could be left to its own devices.
I had no idea about Make.com or n8n, they seem interesting. Thanks for the tip! Will check them out.
My take is that if the LLM outputs text for humans to read, that's not an agent. If it's making API calls and doing things with the results, that's an agent. But given the way "AI" has stretched to become the new "radium" [1], I'm sure "agent" will shortly become almost meaningless.
The definition of agent is blurry. I prefer to avoid that term because it does not mean anything in particular. These are implemented as chat completion API calls + parsing + interpretation.
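A rough sketch of that "completion + parsing + interpretation" pattern, with a dummy complete() helper standing in for the actual API call (not any particular vendor's SDK):

    import json

    def complete(prompt: str) -> str:
        # Dummy stand-in for one chat-completion call returning raw text.
        return '{"action": "refund", "order_id": "A123"}'

    def act(raw: str) -> str:
        # Parsing: turn the model's text into structured data.
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            return "could not parse model output"
        # Interpretation: map the structured output onto real behaviour.
        if parsed.get("action") == "refund":
            return f"issuing refund for order {parsed['order_id']}"
        return "no recognised action"

    print(act(complete("Customer wants a refund for order A123.")))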
Seems like a great service!
We could definitely consider this as a viable option for our startup's application.
I am not an expert on OSM data or maps in general, but how customizable is your library? Could I relatively easily add the house numbers of addresses from OSM data onto the buildings on the map?
This is totally customizable, and the house numbers are already present in the original data; they're just not displayed in the default styles. You can customize the style using the Maputnik editor; the house numbers are visible in the "Inspector" view.
https://maputnik.github.io/editor?style=https://tiles.openfr...
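For what it's worth, here's a sketch of the kind of symbol layer you would add to the style to render them, written as a Python dict; the source, source-layer and attribute names follow the common OpenMapTiles schema, so double-check them against your actual tiles:

    # Sketch of a MapLibre-style symbol layer for house numbers.
    # "housenumber" source-layer/attribute names assume the OpenMapTiles schema.
    housenumber_layer = {
        "id": "housenumbers",
        "type": "symbol",
        "source": "openmaptiles",
        "source-layer": "housenumber",
        "minzoom": 17,                      # only show when zoomed in close
        "layout": {
            "text-field": "{housenumber}",  # label each point with its number
            "text-size": 10,
        },
        "paint": {"text-color": "#555555"},
    }

Pasting the equivalent JSON into the layer list in Maputnik (or into your style.json) should be enough to get the numbers showing at high zoom.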