Well, I know I can just download the app to see what it looks like, but I can't right now, so it would be nice to see some screenshots of the app on the website.
Worth listening to if you are interested in the decisions that go into creating, designing, naming, doing usability testing and promoting a desktop app like Easy Data Transform.
Cool tool. Tried it on an annoying dataset I know well. Three specific requests:
#1. The "Show First 10 Rows" dropdown... nice here would be "Show First 10 MOST FREQUENT Rows" ... helps get a view of the distribution of values
#2. A "Map" transformation - you can give it a list of input values and a list of one or more output values to which the inputs should be mapped. E.g. input values might be "New York", "Peekskill" and "Middletown" which map to "New York State" which can be placed in a new column (like the "If" transform)
#3. Finally because it's Hackernews... a "Function" transform allowing something like a Javascript function to be applied to a column, the output put in another column
>#1. The "Show First 10 Rows" dropdown... nice here would be "Show First 10 MOST FREQUENT Rows" ... helps get a view of the distribution of values
You should be able to do this with a pivot, then a sort. But pivot doesn't work with non-numeric values at present. Next release!
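For anyone who wants that "most frequent values" view in the meantime, here is a minimal Python sketch of the count-then-sort idea. It is not how Easy Data Transform does it; the `most_frequent` helper and the sample rows are made up for illustration.

```python
from collections import Counter

def most_frequent(rows, column, n=10):
    """Return the n most common values in `column` as (value, count) pairs."""
    counts = Counter(row[column] for row in rows)
    return counts.most_common(n)

# Made-up sample rows, standing in for a loaded CSV file
rows = [
    {"city": "New York"},
    {"city": "Peekskill"},
    {"city": "New York"},
    {"city": "Middletown"},
    {"city": "New York"},
    {"city": "Peekskill"},
]

top = most_frequent(rows, "city", n=2)
print(top)  # [('New York', 3), ('Peekskill', 2)]
```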
>#2. A "Map" transformation - you can give it a list of input values and a list of one or more output values to which the inputs should be mapped. E.g. input values might be "New York", "Peekskill" and "Middletown" which map to "New York State" which can be placed in a new column (like the "If" transform)
You could do that with 'IF'. But I guess that could be a bit verbose and I should perhaps offer a 'Lookup' transform as well. The table lookup has the advantage that the lookup table can be created/modified by Easy Data Transform.
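To make the lookup idea concrete, a rough Python sketch of what such a transform might do. The function name, the `region` column and the `Unknown` default are assumptions for illustration, not anything in the product.

```python
def lookup_transform(rows, source, target, table, default=""):
    """Add a `target` column by looking up each row's `source` value in `table`."""
    return [{**row, target: table.get(row[source], default)} for row in rows]

# The lookup table is plain data, so it could itself come from another file
state_lookup = {
    "New York": "New York State",
    "Peekskill": "New York State",
    "Middletown": "New York State",
}

rows = [{"city": "Peekskill"}, {"city": "Boston"}]
mapped = lookup_transform(rows, "city", "region", state_lookup, default="Unknown")
print(mapped)
# [{'city': 'Peekskill', 'region': 'New York State'}, {'city': 'Boston', 'region': 'Unknown'}]
```

Compared with a chain of 'IF' conditions, adding a new city here is just one more row in the table.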
>#3. Finally because it's Hackernews... a "Function" transform allowing something like a Javascript function to be applied to a column, the output put in another column
Yes, an option to have some sort of scriptable transform would be very useful (even if it is slightly at odds with the "without programming" positioning). I personally loathe JavaScript, but I guess it would be easier to embed than, say, Python or Lua.
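As a sketch of what a scriptable transform boils down to (in Python rather than an embedded scripting language, and with a hypothetical `function_transform` helper): apply a user-supplied function to one column and write the result to a new one.

```python
def function_transform(rows, source, target, func):
    """Apply `func` to each row's `source` value and store it in a new `target` column."""
    return [{**row, target: func(row[source])} for row in rows]

# Made-up rows; the "script" here is just a lambda that tidies up names
rows = [{"name": "  alice smith "}, {"name": "BOB JONES"}]
cleaned = function_transform(rows, "name", "name_clean", lambda s: s.strip().title())
print(cleaned)
# [{'name': '  alice smith ', 'name_clean': 'Alice Smith'}, {'name': 'BOB JONES', 'name_clean': 'Bob Jones'}]
```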
I imagine you could do much of this with NiFi - https://nifi.apache.org/, though if your needs are simple, something like this would definitely be much easier to deal with.
I don't know how good this is, but from that page it looks like the "no-code required" approach for data transformation I saw as part of many solutions.
When I write data-transformation code, I always have the feeling that it's often too inter-connected and an approach, like the one this tool follows, would be nicer.
Somehow only the core idea of using these connected nodes is good; the rest of the UI is too clunky, so I drop down to "real" code again for some nodes, and sooner or later the mixing of nodes and code becomes too cumbersome and I drop down to "real" code for everything.
I love these kinds of tools and this looks very useful indeed, but something about your page triggers my spidey sense. A pricing page or at least a hint at a business model would make that go away. Right now it feels like you’re just trying to get me to run a binary on my computer.
Edit: not saying you’re shady, just that it has a vibe of being shady :)
It is free while we are in beta, because I hope that will result in more feedback. Also I don't feel comfortable charging for something that isn't quite production quality yet. But the plan is to charge an annual sub after it comes out of beta (price undecided).
I can see that might trigger some people to think it is of dubious provenance. Maybe I should put the above on a 'Buy' page?
BTW the software is digitally signed (and notarized on Mac) and we've been selling software online since 2005.
http://oryxdigital.com/
Strongly suggest perpetual licensing unless this is truly a service, i.e., you're offloading processing to the cloud (which frankly would be a deal-breaker for me if I were evaluating this for a client).
If this is a standalone binary, what I'd want to see as a user would be a one-time purchase and an optional annual support plan.
My other two products are perpetual licences with optional paid major upgrades and include support. But the world is changing and a yearly sub is attractive for vendors. Simple and with a more predictable cash flow. It also incentivizes the vendor to keep existing customers happy, rather than always chasing new customers.
Vendors rarely deliver on the promise of continuous improvement quickly enough that I could justify the recurring cost from that standpoint alone. Or I'm already happy with the features, so I'm just paying for feature bloat at that point. (That's definitely been the case for most of the consumer cloud apps I pay for ... which is why I'm slowly migrating back to the desktop in many areas.)
If there are no service features like real-time collaboration, then a recurring revenue model makes even less sense.
I mean, I get it. I've written software that I sell with annual licenses myself. But it depends on cloud services to work, so there are costs to me too. I think that's the point at which to step back and look at the architecture, and ask whether the product is better suited to a web app if recurring revenue is important. Just my two cents....
I disagree - as a vendor I think you should charge based on the value you provide to users, not the costs to run your service. That is to say I don’t think desktop software needs to have a “cloud” piece to it to make it valuable enough to users to justify an annual charge. Users don’t usually care where your app or service runs, just that it solves their pain points quickly and easily.
The question then becomes how you justify that charge, but I think you can legitimately say support and/or new feature development (especially if you allow customers to have some kind of input into that). Having guaranteed support from a company with the technical chops that Andy has would be worth it alone for me in this instance.
JetBrains have an interesting model with their IDEs whereby you can fall back to a perpetual desktop license for their products if you don’t want support or updates. Perhaps that’s a nice compromise option.
> you should charge based on the value you provide to users
Yes, but not infinitely, not for a static product. Otherwise I should also be willing to accept infinite punishment for whatever harm my product does as long as it exists.
If I create a hammer, it could still be generating value in a hundred years, but it could also be used to break windows and kill people. So if I deserve continuous payment, shouldn't I also be liable for the damages? Why should I be infinitely rewarded just because my tool had the potential to add value when it also had the potential to do harm?
> Having guaranteed support [...] would be worth it
I agree with you there. But I think it should be optional. It sounds like that's what JetBrains is doing, albeit in a roundabout way.
I pay a sub for some desktop software I use. I don't have a problem with it.
If you estimate that the average user will use the product for N years, you can charge a one-time fee of X or a yearly sub of X/N and make the same amount either way (assuming you got N right and forgetting inflation). A yearly sub is very simple to explain. And people who use the product for longer pay more, which seems fair. The sub also helps to finance the ongoing costs of development and support.
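To put made-up numbers on that: a small Python sketch comparing the two models, under the same assumptions as above (N is guessed correctly, inflation ignored). The price of 100 and N of 4 years are invented for illustration.

```python
def one_time_revenue(price, years_used):
    """A one-time fee brings in the same amount however long the customer stays."""
    return price

def subscription_revenue(price, expected_years, years_used):
    """A yearly sub of price / expected_years, paid for each year of use."""
    return (price / expected_years) * years_used

X, N = 100.0, 4  # hypothetical one-time price and expected years of use

for years in (2, 4, 6):
    print(years, one_time_revenue(X, years), subscription_revenue(X, N, years))
# 2 100.0 50.0
# 4 100.0 100.0
# 6 100.0 150.0
```

The two only match for a customer who stays exactly N years; shorter-term users pay less under the sub and longer-term users pay more.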
Pretty cool, reminds me of how Advanced Renamer handles batch renaming filenames through a visual stack of methods, like sorting, regex replacing, trimming, renumbering, etc. I think that's a really useful thing. There are lots of other weird online formatting tools I've seen over the years that perform things like this, but the experience is pretty poor. I will probably recommend this to my dad.
Congrats on releasing this. I had a quick look at the site and the manual and couldn't see a list of file formats that can be used. For example can I use it with JSON or XML files?
Not yet. Currently it can read delimited text (e.g. CSV) and XLS(X)* and write delimited text. But I do plan to add other input/output formats, depending on feedback.
I need to think a bit about how to flatten an XML/JSON doc into a table and then turn it back into an XML/JSON doc.
(*XLS(X) input currently only works on Windows, because it uses ActiveX. But I plan to have XLS(X) input/output on Windows and Mac at some point.)
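On flattening a JSON doc into a table and back, one common approach (only a sketch, and not a statement of what Easy Data Transform will do) is to turn nested keys into dotted column names. The `flatten` and `unflatten` helpers below are hypothetical; they handle nested objects only and ignore the harder cases: arrays, empty objects, and keys that already contain a dot.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat

def unflatten(flat):
    """Rebuild the nested dict from dotted column names."""
    nested = {}
    for name, value in flat.items():
        *parents, leaf = name.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested

# Made-up record standing in for one JSON object
record = {"name": "Ann", "address": {"city": "Peekskill", "geo": {"lat": 41.29}}}
row = flatten(record)
print(row)  # {'name': 'Ann', 'address.city': 'Peekskill', 'address.geo.lat': 41.29}
print(unflatten(row) == record)  # True
```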
I don't see any advantage to making this a SaaS. It would just result in more latency and potential privacy issues.
It is true that a desktop system may not be suitable for transforming million-row datasets or processing that is running 24x7 - but that is not the market we are aiming for.
I think the benefit of a SaaS in this case would be:
1. Users always work with the latest version, so you only have 1 version to support
2. It would make monthly pricing an easier sell
But I think there are some downsides here, with an app that is solely about data:
1. If users' data has to flow through it, there are privacy, GDPR and intellectual property concerns (for both the SaaS vendor and customers)
2. Latency, since you're going to have to upload data
3. Possibly issues with bandwidth fees (I think most clouds only charge for egress bandwidth, but users are still going to want to download the processed data)
4. Monthly pricing is a big turn-off for a large segment
It is written in C++/Qt, so it wouldn't be that hard to add a Linux version. But I'm not sure that the market that this is aimed at uses Linux in any appreciable numbers. You are the first to ask!
Also building binaries for Linux is a pain. Which distributions to support?
I was about to ask the same but then I realised I'm not in the target market. The number of "data noobs" on Linux is probably indeed quite low. Fair enough
But would you pay for something like this? Or would you use a free tool such as R?
My gut feeling is that Linux users:
a) Don't want to pay for software.
b) Probably have the technical chops to roll their own solution in Python/R/SQL.
I haven't tried the thing yet, but if it made my life easier, I would most certainly pay for and use it. Or get my customers to do so.
Data pipelining, cleaning and feature construction is the most time consuming part of data science. It's almost always a struggle, and the process usually produces fragile and ephemeral code that will need to be rewritten to put into production. If you could provide a LabVIEW-like GUI thing to remove some of this drudgery, assuming you could hook it up to database and CSV back ends, or if there were a target which could do this, and the result were robust and could be deployed, it would make me much more productive than fiddling around with pandas or R data tables.
Maybe it isn't what you built, but I've said many times this is the most useful data science product; the one that is 10x easier than writing your own every damn time. Fancy woo-woo classifiers with alleged superpowers don't even begin to compare to tooling like this.
Interesting. I'd love to know how close or far it is from what you are looking for.
Currently it can input XLS/XLSX (on Windows) and delimited text (e.g. CSV) and can output delimited text. I am looking to support other inputs and outputs, e.g. JSON / ODBC / SQLite, depending on feedback.
well, I don't work on osx or windows, like, ever. ;-)
csv is the lingua franca for prototyping of course. For deploying (which could make your code more sticky in an enterprise setup), you need to hook up to real databases.
I'd suggest finding a local senior data scientist or consulting group and using them to drive your product feature development. I'm not sure who you had in mind for your end user, but I do think data science types (the ones who get paid; not smurfs who want a DS job some day) are a viable market for something like this.
Marketing people for a start. They often have vast reams of data (email lists, postal lists, analytics data) that they need to clean, reformat and analyze.
I love these kinds of programming tools, but do not understand the terminology. Using programming tools has always been called, eeer, "programming"? Is there something that I'm missing here? What's the point of saying "no programming" when you are, precisely, programming?
It is aimed at professionals who have data to transform, but aren't programmers or data science professionals.
Use cases include:
* making a list of all the people in mailing list A that are not in mailing list B
* filtering a log file
* joining two spreadsheets
* renaming, reordering and adding/deleting columns in a table
* reformatting dates
* de-duplicating a postal mailing list
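For readers who do program, the first and last use cases above look roughly like this in Python. It is only a sketch for comparison, not how Easy Data Transform implements them; the `normalise` and `in_a_not_b` helpers and the addresses are made up.

```python
def normalise(email):
    """Treat addresses as equal regardless of case and surrounding spaces."""
    return email.strip().lower()

def in_a_not_b(list_a, list_b):
    """Entries of list_a that are not in list_b, de-duplicated, original order kept."""
    seen = {normalise(e) for e in list_b}
    result = []
    for email in list_a:
        key = normalise(email)
        if key not in seen:
            seen.add(key)  # also drops later duplicates within list_a
            result.append(email)
    return result

list_a = ["ann@example.com", "Bob@Example.com", "cat@example.com", "ann@example.com "]
list_b = ["bob@example.com"]
print(in_a_not_b(list_a, list_b))  # ['ann@example.com', 'cat@example.com']
```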
It is desktop software for Windows and Mac, so there is no latency and you don't have to upload sensitive data to a third party server.
At some point we plan to start charging. But the current beta is free until the end of November. And there may be another free beta after that.
We would love to get some feedback. Particularly from people using it to solve real world problems.