
It is nice to see more ORMs, but inventing a new file format and language `toasty` isn't my cup of tea. I'd rather define the models in Rust and let the generator emit more Rust files.

Creating your own file format is always difficult. Now you have to come up with syntax highlighting, refactoring support, go-to-definition, etc. When I prototype, I tend to rename my columns a lot and move them around. That is when the robust refactoring support the language's own LSP already provides is beneficial, and this approach throws all of that away.


My experience with Prisma, which has a very similar DSL for defining schemas, has changed my mind on this. It makes me much more productive when maintaining large schemas. I can make a one-line change in the schema file and instantly have types, models, and up/down migrations generated and applied, guaranteed to be correct. No issues with schema drift between different environments or type differences between my code and the db.

Prisma is popular enough that it also has an LSP and syntax highlighting widely available. For a simple DSL this is actually very easy to build. Excited to have something similar in the Rust ecosystem.


I mostly agree with this, but the trouble is (probably) that proc-macros are heavy-handed, inflexible, and not great for compile times.

In this case, for example, it looks like the generated code needs global knowledge of related ORM types in the data model, and that just isn't supported by proc-macros. You could push some of that into the trait system, but it would be complex to the point where a custom DSL starts to look appealing.

Proc-macros also cannot be run "offline", i.e. you can't commit their output to version control. They run every time the compiler runs, slowing down `cargo check` and rust-analyzer.


You can absolutely do global knowledge in proc macros via the filesystem and commit their output to version control: https://github.com/trevyn/turbosql

The Roc language can get away with using only Result because of its novel error handling: it will "infer" the error types for functions. Normally, having a lot of error types is kind of cumbersome to handle, but inferring them makes it easy to add safe error types, so why not replace all the Options, etc., with a single type as well?

If you are interested, there is a good YouTube video from five months ago by Richard Feldman: https://www.youtube.com/watch?v=7R204VUlzGc (error handling starts at 28 minutes; at 42 minutes you can see the inferred error types, which is very neat).


> It would be great if the SQLite team published an official npm package bundling the WASM version, could be a neat distribution mechanism for them.

I think they've been doing that for a while; in a JS script you can already do this:

    import sqlite3InitModule from "https://cdn.jsdelivr.net/npm/@sqlite.org/sqlite-wasm/sqlite-wasm/jswasm/sqlite3-bundler-friendly.mjs";

    const sqlite3 = await sqlite3InitModule({
        locateFile(file: string) {
            return "https://cdn.jsdelivr.net/npm/@sqlite.org/sqlite-wasm/sqlite-wasm/jswasm/sqlite3.wasm";
        },
    });

    // SQLite's C API
    const capi = sqlite3.capi;
    console.log("sqlite3 version", capi.sqlite3_libversion(), capi.sqlite3_sourceid());

    // OO API example below oo1 docs https://sqlite.org/wasm/doc/tip/api-oo1.md
    const oo = sqlite3.oo1;

    const db = new oo.DB();
    const createPersonTableSql = `
    CREATE TABLE IF NOT EXISTS person (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        age INTEGER NOT NULL
    );
    `;
    db.exec([createPersonTableSql]);

It works in a regular old script tag with type=module, or in Deno. I have an example HTML file here:

https://github.com/Ciantic/experimenting-sqlite-wasm/blob/ma...


> > It would be great if the SQLite team published an official npm package

> I think they've been doing that for a while,

Kinda: <https://sqlite.org/wasm/doc/trunk/npm.md>

We in the sqlite project neither use nor require npm in any capacity whatsoever, so it would be kinda silly for us to attempt to support it. We instead leave that level of code/tools to folks who use and/or care about them.

There _is_ an "officially sanctioned/blessed" npm repo, and we actively support its maintainer (e.g. we participate in the issue tracker and make patches in the core distribution where they're strictly needed), but we otherwise keep a "hands off" policy when it comes to non-standardized APIs and toolchains.

We _like_ to see people plug the sources into their tools of choice, but we cannot feasibly take on the burden of doing that plugging-in for them, especially given how fluid the JavaScript ecosystem is when it comes to frameworks and tools.

Sidebar: we rely heavily on Emscripten because, for all practical purposes, it has no substitute, but we also actively go out of our way to ensure that the sources can be easily plugged into an alternative should one ever appear.


Yes, my experience as well. That's why having an alternate small display (8" or smaller) for your laptop/desktop is better. You can then drag any application on it and use it as a timer or busy sign or whatever. It's way more convenient than an extra device you have to reach and fiddle with.

Incidentally Bootstrap 5.3 seems to have the same problem as the article describes. There is a gap which doesn't do anything if clicked, right between the radio button and the label.


The utility of tracing is great; I've been using Azure Application Insights with NodeJS (and of course in .NET). This is relatively simple because it monkey patches itself everywhere if you go through the "classic" SDK route. Then adding your own data to logs is just a few simple functions: trackTrace, trackException, trackEvent, etc.
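
Roughly like this with the classic SDK (just a sketch, connection string is a placeholder): setup()/start() does the monkey patching, and the track* functions add your own telemetry:

    const appInsights = require("applicationinsights");
    // auto-instruments (monkey patches) common modules
    appInsights.setup("<connection string>").start();
    const client = appInsights.defaultClient;

    client.trackTrace({ message: "cache refreshed" });
    client.trackEvent({ name: "checkout", properties: { plan: "pro" } });
    client.trackException({ exception: new Error("boom") });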

However, if you want to figure out how it works, you might be scared; it is not lightweight. I just spent a few days digging through the Azure Application Insights NodeJS code base, which integrates with the OpenTelemetry packages. It's an utter mess with a huge amount of abstraction. Adding it to the project brought in 100 MB and around 40 extra packages.


This isn't just a problem in JS.

In every language I looked at, the OTel libraries were a bloated, over-abstracted, and resource-hungry mess.

I think that's partially because it is actually difficult and complex to implement, and partially because the implementations are often written by devs without a long history of implementing similar systems.


It's been a bit since I've added it to an existing project, but at least as of a year or so ago, the Rust implementation (tracing + tracing-opentelemetry + opentelemetry-jaeger specifically for that project) was similar.

The impact on compile time and code size wasn't bad (for a project that was large and already pulling in a lot of crates), but it had a huge runtime cost - mostly allocator pressure in the form of temp allocations from what I could see. For a mostly I/O-bound workload, it more than doubled CPU utilization and ballooned OS-measured memory consumption by >30%.


The OpenTelemetry spec is a mess. There's so much … abstract blah blah blah? … and very little actual details.

If I actually go to the part of the spec that I think gets down to "here is how to concretely write OpenTelemetry stuff" [1], it seems to have the various attributes camelCased, for example, whereas the article has named them "spanID" and "traceID".

AFAICT the "spec" also just links you to the implementation. "Just" read this protobuf definition, translate that to JSON in your mind's eye. I "POST" this to a hard-coded path tacked onto a URL… but do I post individual traces/logs? Can I batch them? I'm sure there's a gRPC thing I could start guessing from…

But it seems like the JSON stuff is a second class citizen from the gRPC interface. Unless that's just as bad, too…

Actually getting set up in Python isn't too terrible, though there are a few classes that you're like "what's the point of this?" and most of them are apparently just undoc'd. (E.g., [2], ^F TraceProvider, get nothing.)

It is a bit depressing how this seems to be becoming The Chosen Spec.

I also sort of hate the 64-bit integers for span IDs (TFA never mentions it, but AFAICT this is required by spec). I'd much rather have "/span/ids/are/a/tree" span IDs, as this integrates much better with any logging system: I can easily ask my log viewer to filter to a specific span (span_id == "/spans/a/b/c") or to a subtree (span_id regex-matches /^\/spans\/a\/.*/)

(And the spec bizarrely focuses on some sort of language-abstract API, instead of … actual data types / media types?)

[1]: https://opentelemetry.io/docs/specs/otlp/#otlphttp

[2]: https://opentelemetry-python.readthedocs.io/en/latest/api/tr...


The .NET implementation is about as clean as it can get, but a lot of that has to do with Microsoft caring very deeply about this kind of performance data for a very long time (thus having the entire System.Diagnostics namespace).

There’s certainly some abstraction that is gratuitous still, but it’s better than most of the architect astronaut code I’ve seen targeting the CLR.


Yes, this is exactly my impression too: the code for opentelemetry-js is over-engineered and adds a scary amount of dependency code. There are quite a bunch of libraries where I'm not sure what they do or in which scenarios I might need them. The documentation is not very helpful either. I look forward to someone implementing an opentelemetry-nano package with only the minimum stuff needed, letting me choose extra support for my dependencies or an easy way of adding my own wrappers.


Also badly documented. If you try to implement something non-standard with it, good luck. I once needed to write code where a trace started in Node and continued inside a Node-API native library. Getting those two traces to connect must be one of the most frustrating things I've worked on.

At least on the Rust side you have types to help you out, but it is still quite complex and the crates have bugs open for years, impossible to solve with the current architecture.


I had a lot of fun wading through that mess in the past trying to determine why something wasn't working. A fun fact that I just learned is that the node sdk is now just a shim over https://www.npmjs.com/package/@azure/monitor-opentelemetry. It seems like the future is just using that package directly which hopefully improves the situation. One benefit is you can extend it with OTel instrumentation packages.


What are your plans for the Application Insights sunsetting?


https://azure.microsoft.com/en-us/updates/we-re-retiring-cla...

Do you have a link for what you are speaking of?


There’s no public announcement yet, but from what reps say to customers and what people working on Azure say, App Insights is more or less being wound down in favor of building out open source solutions, because it’s more favourable and less maintenance/dev work than building out their own solution. Think more OTel/Grafana. Basically, word on the inside is MS doesn’t want to pay to build out App Insights.


Yes, I loved the MIP models too. I'm really annoyed that the last Garmin Forerunner watch with an MIP display was the 955, and it's now discontinued and was recently removed from Garmin's website. All their new models have OLED, and I kind of understand, as it looks better indoors, but it wastes a huge amount of battery when used outside.

I would have preferred they investigate some of these newer LCD screens that can work reflectively and optionally with backlights on.


I'm a long-distance runner, so my requirements are slightly different from the average person's. But that said, it seems to me Garmin's strategy is clear: the Forerunner and even the Fenix lines are going to be OLED, whereas the Enduro is the MIP line.

Sure, there's a MIP Fenix 8, but I feel like that might be something that eventually goes away as more people come to sports watches from Apple or Google watches.. those people see 7 days of battery life and think "wow", whereas we look at the always-on GPS time and think "more please".

The absolute pick of this generation is the Enduro 3 now. It's cheaper, lasts waaaaay longer, and does everything we want. Fenix 8? Dive computer, and the ability to take calls? No thanks. I just want more battery life, and better solar thanks.


I have no idea how Garmin markets outdoor activity watches with OLED screens.


Yep, I had a cheaper MIP garmin watch that I was very happy with until it spontaneously bricked itself one day. It was just barely in warranty, and they replaced it, but refused to give me an equivalent replacement and instead sent the newer OLED model in the same lineup. It's... fine, but the battery life is abysmal with the always-on display and just OK without.


Yeah, my Forerunner 955 MIP was the sweet spot of size and battery life. Sadly the battery lifespan is fairly short on those too.

I went for the Instinct 2 next purely to avoid the OLED infestation.


Here is one example from the PDF:

    FROM r JOIN s USING (id)
    |> WHERE r.c < 15
    |> AGGREGATE sum(r.e) AS s GROUP BY r.d
    |> WHERE s > 3
    |> ORDER BY d
    |> SELECT d, s, rank() OVER (order by d)
Can we call this SQL anymore after this? This re-ordering of things has been done by others too, like PRQL, but they didn't call it SQL. I do think it makes things more readable.


The point of SQL pipe syntax is that there is no reordering. You read the query as a sequence of operations, and that's exactly how it's executed. (Semantically. Of course, the query engine is free to optimize the execution plan as long as the semantics are preserved.)

The pipe operator is a semantic execution barrier: everything before the `|>` is assumed to have executed and returned a table before what follows begins:

From the paper:

> Each pipe operator is a unary relational operation that takes one table as input and produces one table as output.

Vanilla SQL is actually more complex in this respect because you have, for example, at least 3 different keywords for filtering (WHERE, HAVING, QUALIFY) and everyone who reads your query needs to understand what each keyword implies regarding execution scheduling. (WHERE is before grouping, HAVING is after aggregates, and QUALIFY is after analytic window functions.)
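
For example (a minimal sketch reusing the r table from the example upthread), the pre- and post-aggregation filters need different keywords in vanilla SQL, while the pipe form uses WHERE at both stages:

    -- vanilla SQL: keyword depends on the execution stage
    SELECT d, sum(e) AS s
    FROM r
    WHERE c < 15        -- filter before grouping
    GROUP BY d
    HAVING sum(e) > 3;  -- filter after aggregation

    -- pipe syntax: WHERE at whichever stage you need
    FROM r
    |> WHERE c < 15
    |> AGGREGATE sum(e) AS s GROUP BY d
    |> WHERE s > 3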


Golly, QUALIFY, a new SQL operator I didn’t know existed. I tend not to do much with window functions and I would have reached for a CTE instead but it’s always nice to be humbled by finding something new in a language you thought you knew well.


It's not common at all; it's a non-ANSI SQL clause that AFAIK was created by Teradata: syntactic sugar for filtering on window functions directly, without CTEs or temp tables, especially useful for dedup. In most cases, at least; for example, you can't do a QUALIFY in a query that is aggregating data, just as you can't use a window function when aggregating.
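
A minimal sketch of the dedup use case (hypothetical table and columns), keeping only the newest row per id:

    -- keep only the most recent row per id
    SELECT *
    FROM events
    QUALIFY row_number() OVER (PARTITION BY id ORDER BY updated_at DESC) = 1;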

Other engines that implement it are direct competitors in that space: Snowflake, Databricks SQL, BigQuery, ClickHouse, and DuckDB (the only OSS implementation I know of). Point is: if you want to compete with Teradata and be a possible migration target, you want to implement QUALIFY.

Anecdote: I went from a company that had Teradata to another where I had to implement the whole data stack in GCP. I shed tears of joy when I learned BQ also had QUALIFY. And the intent was clear, as they also offered various Teradata migration services.


> The point of SQL pipe syntax is that there is no reordering.

But this thing resembles other FROM-clause-first variants of SQL, thus GP's point about this being just a reordering. GP is right: the FROM clause gets re-ordered to be first, so it's a reordering.


> The pipe operator is a semantic execution barrier: everything before the `|>` is assumed to have executed and returned a table before what follows begins

I already think about SQL like this (as operations on lists/sets). However, thinking of it like that, with previous operations feeding into the next, which is conceptually nice, seems to make this harder to do and to think about:

> (the query engine is free to optimize the execution plan as long as the semantics are preserved)

since logically each part between the pipes doesn't know about the others, global optimizations, such as using indexes to restrict the result of a join based on the WHERE clause, can't be done or become more difficult.


This kind of implies there's better or worse ordering. AFAIK that's pretty subjective. If the idea was to expose how the DB is ordering things, or even make things easier for autocomplete OK, but this just feels like a "I have a personal aesthetic problem with SQL and I think we should spend thousands of engineering hours and bifurcate SQL projects forever to fix it" kind of thing.


> Vanilla SQL [...] QUALIFY is after analytic window functions

Isn't that FILTER (WHERE), as in SELECT avg(...) FILTER (WHERE ...) FROM ...?


This is an interesting point.

All these years I've been doing that reordering and didn't even realize!


> The point of SQL pipe syntax is that there is no reordering.

If you're referring to this in the comment you're replying to:

> Can we call this SQL anymore after this? This re-ordering of things ...

Then they're clearly just saying that this is a reordering compared to SQL, which is undeniably true (and the whole point).


The post I was referring to said that this new pipe syntax was a big reordering compared to the vanilla syntax, which it is. But my point is that if you're going to understand the vanilla syntax, you already have to do this reordering in your head, because the order in which the vanilla syntax executes (inside out) is the order in which the pipe syntax reads. So it's just easier all around to adopt the pipe syntax so that reading and execution are the same.


'qualify' is now standard? Thought it was a vendor extension currently.


This is an extension on top of all existing SQL. The pipe functions more or less as a unix pipe. There is no reordering, but the user selects the order. The core syntax is simply:

  query |> operator
Which results in a new query that can be piped again. So e.g. this would be valid too:

  SELECT id, a, b FROM table WHERE id > 1
  |> WHERE id < 10
Personally, I can see this fixing so much SQL pain.


Okay, now I can see why this reminds me so much of CTEs.


The multiple uses of WHERE with different meanings is problematic for me. The second WHERE, filtering an aggregate, would be HAVING in standard SQL.

Not sure if this is an attempt to simplify things or an oversight, but favoring convenience (no need to remember multiple keywords) over explicitness (but the keywords have different meanings) tends to cause problems, in my observation.


In the query plan, filtering before or after an aggregation is the same, so it's a strange quirk that SQL requires a different word.


I was not there at the original design decisions of the language, but I imagine it was there specifically to help the person writing/editing the query easily recognize and interpret filtering before or after an aggregation. The explicitness makes debugging a query much easier and ensures it fails earlier. I don't see much reason to stop distinguishing one use case from the other, I'm not sure how that helps anything.


I also wasn't there, but I think this actually wasn't to help authors and instead was a workaround for the warts of SQL. It's a pain to write

    SELECT * FROM (SELECT * FROM ... GROUP BY ...) t WHERE ...
and they decided this was common enough that they would introduce a HAVING clause for this case

    SELECT * FROM ... GROUP BY ... HAVING ...
But the real issue is that in order to make operations in certain orders, SQL requires you to use subselects, which require restating a projection for no reason and a lot of syntactical ceremony. E.g. you must give the FROM item a name (t), but it's not required for disambiguation.

Another common case is projecting before the filter. E.g. you want to reuse a complicated expression in the SELECT and WHERE clauses. Standard SQL requires you to repeat it or use a subselect since the WHERE clause is evaluated first.
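
A small sketch of that case (hypothetical column names): the alias is not visible to WHERE, so the expression has to be written twice (or the query nested):

    -- "prefix" cannot be referenced in WHERE, so the expression repeats
    SELECT substr(code, 1, 8) AS prefix
    FROM orders
    WHERE substr(code, 1, 8) = 'ABC-2024';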


I think this stems from the non-linear approach to reading a SQL statement. If it were top-to-bottom linear, like PRQL, then the distinction does not seem merited. It would then always be filtering from what you have collected up to this line.


I think the original sin here is not making aggregation an explicitly separate thing, even though it should be. Adding a count(*) fundamentally changes what the query does, and what it returns, and what restrictions apply.


Indeed. Just as I think git’s N different ways to refer to the same operation was a blunder.


But pre- and post- aggregation filtering is not really "the same" operation.


If I use a CTE and filter the aggregate, feels the same to me.


If you perform an aggregation query in a CTE, then filter on that in a subsequent query, that is different, because you have also added another SELECT and FROM. You would use WHERE in that case whether using a CTE or just an outer query on an inner subquery. HAVING is different from WHERE because it filters after the aggregation, without requiring a separate query with an extra SELECT.


> HAVING is different from WHERE because it filters after the aggregation, without requiring a separate query with an extra SELECT.

Personally I rarely use HAVING and instead use WHERE with subqueries for the following reasons:

1-I don't like repeating/duplicating a bunch of complex calcs, easier to just do WHERE in outer query on result

2-I typically have outer queries anyway for multiple reasons: break logic into reasonable chunks for humans, also for join+performance reasons (to give the optimizer a better chance at not getting confused)


The main (only?) task I routinely use HAVING for is finding duplicates.


You can always turn a HAVING in SQL into a WHERE by wrapping the SELECT that has the GROUP BY in another SELECT that has the WHERE that would have been the HAVING if you hadn't bothered.
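
A minimal sketch of that rewrite, reusing the aggregate from the example upthread:

    -- with HAVING
    SELECT d, sum(e) AS s FROM r GROUP BY d HAVING sum(e) > 3;

    -- same thing, wrapped so a plain WHERE works
    SELECT * FROM (SELECT d, sum(e) AS s FROM r GROUP BY d) t WHERE t.s > 3;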

You don't need a |> operator to make this possible. Your point is that there is a reason that SQL didn't just allow two WHERE clauses, one before and one after GROUP BY: to make it clearer syntactically.

Whereas the sort of proposal made by TFA is that if you think of the query as a sequence of steps to execute then you don't need the WHERE vs. HAVING clue because you can see whether a WHERE comes before or after GROUP BY in some query.

But the whole point of SQL is to _not have to_ think of how the query is to be implemented. Which I think brings us back to: it's better to have HAVING. But it's true also that it's better to allow arbitrary ordering of some clauses: there is no reason that FROM/JOIN, SELECT, ORDER BY / LIMIT have to be in the order that they are -- only WHERE vs. GROUP BY ordering matters, and _only_ if you insist on using WHERE for pre- and post-GROUP BY, but if you don't then all clauses can come in any order you like (though all table sources should come together, IMO).

So all in all I agree with you: keep HAVING.


> The second WHERE, filtering an aggregate, would be HAVING in standard SQL.

Only if you aren't using a subquery; otherwise you would use WHERE even in plain SQL. Since the pipe operator is effectively creating subqueries, the syntax is perfectly consistent with SQL.


Perhaps, but then you eliminate the option of writing WHERE/HAVING sum(r.e) > 3, so if you forget what the alias s means, you have to figure that part out before proceeding. Maybe I'm just used to the existing style, but as stated earlier, this seems to reduce explicitness, which IMO tends to lead to more bugs.


A lot of SQL engines don't support aliases in the HAVING clause and that can require duplication of potentially complex expressions which I find very bug-inducing. Removing duplication and using proper naming I think would be much better.

I will already use subqueries to avoid issues with HAVING.


> A lot of SQL engines don't support aliases in the HAVING clause

We're moving from SQLAnywhere to MSSQL, and boy, we're adding 2-5 levels of subqueries to most non-trivial queries due to issues like that. Super annoying.

I had one which went from 2 levels deep to 9... not pleasant. CTEs had some issues so couldn't use those either.


I'm surprised you had issues with CTEs -- MS SQL has one of the better CTE implementations. But I could see how it might take more than just trivial transformations to make efficient use of them.


I don't recall all off the top of my head.

One issue, that I mentioned in a different comment, is that we have a lot of queries which are used transparently as sub-queries at runtime to get count first, in order to limit rows fetched. The code doing the "transparent" wrapping doesn't have a full SQL parser, so can't hoist the CTEs out.

One performance issue I do recall was that a lateral join of a CTE was much, much slower than just doing 5-6 sub-queries of the same table, selecting different columns or aggregates for each. Think selecting sum packages, sum net weight, sum gross weight, sum value for all items on an invoice.

There were other issues using plain joins, but I can't recall them right now.


CTEs (at least in MS SQL land) are a syntax-level operation, meaning a CTE gets expanded as if you wrote the same subquery at each place it is referenced, which frequently impacts the optimizer and performance.

I like the idea of CTEs, but I typically use temp tables instead to avoid optimizer issues.


If you use temp tables you're subverting the optimizer. Sometimes that's what you want but often it's not.


I use them on purpose to "help" the optimizer by reducing the search space for the query plan (knowing that query plan optimization is a combinatorial problem and the optimizer frequently can't evaluate enough plans in a reasonable amount of time).


Can you please share the SQL queries? If tables/columns are sensitive, maybe it can be anonymized replacing tables with t1,t2,t3 and columns c1,c2,c3.


Should we introduce a SUBSELECT keyword to distinguish between a top-level select and a subquery?

To me that feels as redundant as having WHERE vs HAVING, i.e. they do the same things, but at different points in the execution plan. It feels weird to need two separate keywords for that.


The proposal here adds pipe syntax to SQL.

So it would be reasonable to call it SQL, if it gets traction. You want to see some of the big dogs adopting it.

That should at least be possible since it looks like it could be added to an existing implementation without significant disruption/extra complexity.


There may be trademark issues, but even if not, doing sufficient violence to the original thing argues for using a new name for the new thing.


Yes, having |> isn't breaking SQL but rather enhancing it.

I really like this idea of piping SQL queries rather than trying to create the perfect syntax from the get go.

+1 for readability too.


Honestly, it seems like a band-aid on legacy query language.


SQL a legacy query language?

In order for a thing to be considered legacy, there needs to be a widespread successor available.

SQL might have been invented in the 70s but it's still going strong as no real alternative has been widely adopted so far - I'd wager that you will find SQL at most software companies today.

Calling it legacy is not realistic IMO.


I mean, kinda? It's legacy in the "we would never invent this as the solution to the problem domain that's asked of it today" sense.

We would invent the underlying engines for sure, but not the language on top of them. It doesn't map at all to how it's actually used by programmers. SQL is the JS to WebAssembly; being able to write the query plan directly via whatever language or mechanism you prefer would be goated.

It has to be my biggest pain point dealing with SQL, having to hint to the optimizer or write meta-SQL to get it to generate the query plan I already know I want dammit! is unbelievably frustrating.


By that definition JavaScript is also legacy.

> having to hint to the optimizer or write meta-SQL to get it to generate the query plan I already know I want dammit'

That's not in the domain of SQL. If you're not getting the most optimized query plan, there is something wrong with the DBMS engine or statistics -- SQL, the language, isn't supposed to care about those details.


> That's not in the domain of SQL.

That's my point. I think we've reached the point where SQL the language can be more of a hindrance than a help, because in a lot of cases we're writing directly to the engine but with oven mitts on. If I could build the query from the tree with scan, filter, index scan, cond, merge join as my primitives it would be so nice.


Sounds like you don’t want SQL at all. Some sort of non-SQL, or not-SQL, never-SQL. Something along those lines.


That's the thing though, I still want my data to be relational so NoSQL databases don't fit the bill. I want to interact with a relational database via something other than the SQL language and given that this language already exists (Postgres compiles your SQL into an IR that uses these primitives) I don't think it's a crazy ask.


> It's legacy in the "we would never invent this as the solution to the problem domain that's today asked of it."

I don't think that definition of legacy is useful because so many things which hardly anyone calls "legacy" fit the definition - for example: Javascript as the web standard, cars in cities and bipartisan democracy.

I think many of us would say that none of these is an ideal solution for the problem being solved, but it's what we are stuck with, and I don't think anyone could call them "legacy systems" until a viable successor is widespread.


Not bad, very similar to dplyr syntax. Personally I'm too used to classic SQL, though, and this would be more readable as CTEs. In particular, how would this syntax fare if it were much more complicated, with 4-5 tables and joins?


IMO having SELECT before FROM is one of SQL's biggest mistakes. I would gladly welcome a new syntax that rectifies this. (Also https://duckdb.org/2022/05/04/friendlier-sql.html)

I don't love the multiple WHEREs.


Duckdb also supports prql with an extension https://github.com/ywelsch/duckdb-prql


SQL was supposed to follow English grammar. Having FROM before SELECT is like having “Begun” before “these clone wars have.”


That's a great list of friendlier sql in DuckDB. For most of that list I either run into it regularly or have wanted the exact fix they have.


DuckDB is what SQL should be in 2024

https://duckdbsnippets.com/


The very first example on that page is vulnerable to injection.


Which one?


  #!/bin/bash
  function csv_to_parquet() {
      file_path="$1"
      duckdb -c "COPY (SELECT * FROM read_csv_auto('$file_path')) TO '${file_path%.*}.parquet' (FORMAT PARQUET);"
  }


Eh, in the context of the site and other snippets that seems pedantic.

Could it be run on untrusted user input? Sure. Does it actually pose a threat? It's improbable.


> Can we call this SQL anymore after this?

Maybe not, just as we don't call "rank() OVER" SQL. We call it SQL:2003. Seems we're calling this GoogleSQL. But perhaps, in both cases, we can use SQL for short?


You show a good example. Many people would call that SQL, and if pipes become popular, they too might simply be called SQL one day.


> GoogleSQL

EssGyooGell: A Modest Proposal


This is consistent, non-pseudo-English, reusable, and generic. The SQL standard largely defines the aesthetic of the language, and it is in complete opposition to these qualities. I think it would be fundamentally incorrect to call it SQL.

Perhaps if they used a keyword PIPE and a separate grammar definition for the expressions that follow the pipe, such that it is almost what you'd expect but randomly missing things or changing up some keywords.


Yes, we can call it SQL.

Language syntax changes all the time. Their point is that SQL syntax is a mess and can be cleaned up.


In that example, "s" has two meanings: 1. A table being joined. 2. A column being summed.

For clarity, they should have assigned #2 to a different variable letter.


Honestly SQL screwed things up from the very beginning. "SELECT FROM" makes no sense at all. The projection being before the selection is dumb as hell. This is why we can’t get proper tooling for writing SQL, even autocompletion can’t work sanely. You write "SELECT", what’s it gonna autocomplete?

PRQL gives me hope that we might finally get something nice some day


The initial version of SQL was called "Structured English Query Language".

If the designers intended to create a query language that resembled an English sentence, it makes sense why they chose "SELECT FROM".

"Select the jar from the shelf" vs. "From the shelf, select the jar".


“Go to the shelf and select the jar”. You’re describing a process, so it’s natural to formulate it in chronological order.


SQL is a declarative language not a procedural one. You tell the query planner what you want, not how to do it.


    SELECT 1+2;
FROM clauses aren't required, and using multiple tables in FROM doesn't seem to work out too well when that syntax is listed first.


Doesn’t change anything, you can still have the select at the end, and optional from and joins at the beginning. In your example, the select could be at the end, it’s just that there’s nothing before.


Beginning with Oracle Database Release 23 [released May 2, 2024], the FROM DUAL clause is now optional when selecting expressions.


WITH clauses are optional and appear before SELECT. No reason why the FROM clause couldn't behave the same


Isn't that strictly for CTEs? In which case, you are SELECTing from the CTE.


I also hate having SELECT before FROM because I want to think of the query as a transformation that can be read from top to bottom to understand the flow.

But I assume that that’s part of why they didn’t set it up that way — it’s just a little thing to make the query feel more declarative and less imperative


> what’s it gonna autocomplete?

OTOH, if you selected something first, the FROM clause and potentially some joins could autocomplete.


Not reliably, especially if you alias tables. Realistically, you need to know what you’re selecting from before knowing what you’re selecting.


At this point I think that vanilla SQL should just support optionally putting the from before the select. It's useful for enabling autocompletion, among other things.
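
For what it's worth, DuckDB already accepts a FROM-first form; a small sketch with a hypothetical table:

    -- FROM-first: the table is known before the column list is typed
    FROM orders
    SELECT customer_id, total;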


And a simple keyword that does a GROUP BY on all columns in the SELECT that aren't aggregates, just a syntax-level, macro-ish type of thing.


My initial reaction is that the pipes are redundant (syntactic vinegar). Syntactic order is sufficient.

The changes to my SQL grammar to accommodate this proposal are minor. Move the 'from' rule to the front. Add a star '*' around a new filters rule (e.g. zero-or-more, in any order), removing the misc dialect-specific alts, simplifying my grammar a bit.

Drop the pipes and this would be terrific.


In the example, would there be a difference between `|> where s > 3` and `|> having s > 3`?

Edit: nope, just that HAVING doesn't need to exist with the pipe syntax.


Looking at this reminds me of Apache Pig. That’s not a compliment.


We can call it "Linq2SQL" and what a disaster it was...


What's the SQL equivalent of this?


I started to listen to Bill Gates' interview [1], just to hear what he had in mind back then. Sounded almost topical in today's world. AI was mentioned, and predicting users' input in the distant future.

Side note: archive.org has two players. The first one doesn't show a timestamp for where you currently are. The second player, the Winamp clone, does have it, but I don't think one can link to specific parts.

[1] The Bill Gates interview starts at 10:10 https://archive.org/details/the-famous-computer-cafe-1984-11...


> The second player, the Winamp clone

Wait, what? The Winamp clone?

Clicks link

This is beautiful, thank you for this!


A standalone project, incidentally: https://webamp.org/


That's great. I can't see a keep-screen-on option though?


I wanted to say that sounds like too much to make available to web apps at all, but nope, apparently it’s indeed a thing that a website can tell your computer to do[1]. I don’t see an option to do that here either, though. I guess your choices are to submit a pull request[2] adding that capability or to use a manual systemwide toggle (I use Keep Awake![3] on GNOME Shell and Coffee[4] on Android).

[1] https://developer.mozilla.org/en-US/docs/Web/API/Screen_Wake...

[2] https://github.com/captbaritone/webamp

[3] https://extensions.gnome.org/extension/1097/keep-awake/

[4] https://f-droid.org/en/packages/com.github.muellerma.coffee/


Just a few lines of script and you are done. I think it was a reaction to sites playing a hidden video, which was just a waste. Even works on my Ubuntu laptop.
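
A minimal sketch of those few lines using the Screen Wake Lock API linked above (no error handling or re-acquiring after tab switches):

    // ask the browser to keep the screen on; resolves to a WakeLockSentinel
    const wakeLock = await navigator.wakeLock.request("screen");

    // later, when the timer / busy sign is done
    await wakeLock.release();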


Wow!

And you can move and resize the widgets! https://i.imgur.com/PmXmpVO.png

I wonder if we can get an MPV skin of Winamp

https://news.ycombinator.com/item?id=41277014

This is a great interview with Gates. The interviewer is great too, great commentary and questions.

"A machine on every desktop ad a machine in every home - and one of the things that will enable us to do that is graphics"

(and he mentions how great the Macintosh was doing in the graphics area)

---

Crazy the things that have been going on with him lately - and what was said by Thiel about him on JRE, yet HN seems to want to not discuss any of it...

(This 1984 interview with Gates deserves its own HN post. The commercials on it are great as well. And the fact that the interviewer brought up Artificial Intelligence is great - and Gates' response was very cogent about the state of AI and the path forward, where he says "people worry about AI taking over" and says that when "we can make software fully soft, we can get machines to help us".)

Great piece of history, that.


If you click the little icon in the upper-left corner of the UI, you can change 'skins' as well. Very cool


I can't let go of Google Chrome fully, so I'm looking for alternatives for when uBlock Origin ceases to function shortly. I've yet to test this, but it looks nice. However, it doesn't have a big community around it, and I'm wondering if there is a similar tool that does have a community around it.

