I wondered why in the heck a static web site running in a resource limited environment would even attempt to run https. Based on the HN hug of death, it appears the reverse proxy adds HTTPS/TLS/whatever support, and it wasn't able to cope.
I wish that the demand for security theater in web browsers wasn't so high that they effectively prohibit plain old text transport.
> I wondered why in the heck a static web site running in a resource limited environment would even attempt to run https.
Err, I don't think it is. This is in the response headers:
Server: nginx/1.18.0 (Ubuntu)
Getting nginx (let alone Ubuntu) running on an ESP32 would be a seriously impressive achievement.
The web site also says:
Once it is running, you can access your website by entering your ESP32's IP address in a web browser. For example, if your ESP32's IP address is 192.168.1.100, input http://192.168.1.100 in your browser.
The site works for me now, and I'd expect it could support a minimal HTTP (not HTTPS) server if it ran natively, but then we also have this:
You need to download micropython.py file and place it in the root directory of esp32.
So it's written in Python, running on a chip with the compute power of a potato. It would be interesting to compare it to the same thing done in C or Rust.
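For a sense of what that looks like, here's a rough sketch of a minimal MicroPython HTTP server on an ESP32 (not the project's actual code; the WiFi credentials and the page are placeholders):

    # Minimal MicroPython HTTP server sketch for an ESP32.
    # Not the project's actual code; credentials below are placeholders.
    import network
    import socket

    wlan = network.WLAN(network.STA_IF)
    wlan.active(True)
    wlan.connect("your-ssid", "your-password")  # placeholder credentials
    while not wlan.isconnected():
        pass
    print("Listening on", wlan.ifconfig()[0])

    html = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>Hello from ESP32</h1>"

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("0.0.0.0", 80))
    s.listen(1)  # one connection at a time -- hence the hug of death

    while True:
        conn, addr = s.accept()
        conn.recv(1024)  # read (and ignore) the request
        conn.send(html)
        conn.close()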
While any serious game development was done in assembly, languages like Python were plenty to choose from.
So instead of looking down on the project for having chosen Python, as someone who used to play Defender of the Crown at the high school computing club, I marvel at what an ESP32 is capable of.
Yes, you got it right: it's due to that odd message web browsers show for non-HTTPS websites. HTTPS/TLS has been added using nginx on a different server, also to handle some of the load. The website is still under a lot of traffic, but nginx is doing well on its part.
So here is nginx running on a real computer and a homebrew webserver running on a potato, and your theory is nginx is the limiting factor. And it’s all the fault of encryption and the browser cartel.
> Not everything needs to be encrypted. If I'm serving static webpages the only thing I might want to log is which IPs visited at what time of day.
As a frequent user of public WiFi (mostly at coffee shops, airports, etc.), I prefer that every page is encrypted so that nobody can MITM me/tamper with what I see in my browser, even on plain text pages.
If you are frequently using networks you suspect to be hostile, wouldn't you L2VPN your traffic back to a trusted exit point regardless? HTTP/HTTPS is likely only part of the information your computer is providing to said hostile networks. Worrying about the encryption of plain text pages seems like worrying about a stain on the couch whilst the house is on fire.
* Is using HTTPS enough on an insecure network? Should one also be using a VPN?
* Would end-users see a benefit from HTTPS on simple/plaintext sites?
> HTTP/HTTPS is likely only part of the information your computer is providing to said hostile networks.
What other non-encrypted information might a normal person's computer be communicating?
I understand that VPNs do improve privacy. Privacy is moderately important to me, but I don't think it's important enough for me to use a VPN.
There are also occasional vulnerabilities in TLS/SSL/HTTPS but... what can I really do about that? Even a VPN might establish its session with those technologies.
> wouldn't you L2VPN your traffic back to a trusted exit point regardless?
It's reasonable to expect someone technical like myself to do this, and maybe I am really just playing loose with my security. But, nobody outside of the tech community is even thinking about this. 99% of people are happy using whatever WiFi is free and aren't going to question its security.
So, using HTTPS for "simple" sites is still beneficial since you will be making your content more secure for less technical users who might be on insecure networks.
Where does logging enter into it? To my understanding, serving traffic over HTTPS doesn't require you to do any additional logging (or any logging at all).
The point about static webpages would be a potentially good one in a world where ISPs and other Internet middlemen are honest and transparent actors, but this has so far proven not to be the case. I think it's in everyone's interest for your static content to reach my browser without ads or tracking mechanisms being injected by a third party.
What example do you have of an ISP or third party injecting an ad or tracker into the HTTP response? I've certainly seen DNS query hijacking, and while HTTPS will encrypt the transmission, at the ISP level they already have your DNS query and src/dst IP addresses. Even with HTTPS, based on session data it wouldn't be difficult to label Netflix/YouTube traffic patterns.
Do you also have any reference to what exactly the collected data is useful for? I could see an ISP selling traffic data for a zip or area but they would already have that based on your billing address.
At the registrar level: the .tk registrar was (in)famous for injecting both ads and random JS into websites that were hosted on domains registered against it.
At the ISP level: I had a Spanish ISP attempt SSL stripping on me a few weeks ago.
> Do you also have any reference to what exactly the collected data is useful for? I could see an ISP selling traffic data for a zip or area but they would already have that based on your billing address.
The goal is always more (and more precise) data points. Being able to run JS on the same origin as the request is more valuable than just the rough GeoIP data the ISP already has.
> At the registrar level: the .tk registrar was (in)famous for injecting both ads and random JS into websites that were hosted on domains registered against it.
I did a Google search for domain hijacking, ad injection, and JavaScript, and while it does look like .tk domains had/have this issue, it doesn't necessarily point to the registrar. After all, they are offering free domain registration, which is going to get abused. It's also not surprising when their own website doesn't use HTTPS; then again, their mission statement isn't about security on the Internet.
> The goal is always more (and more precise) data points. Being able to run JS on the same origin as the request is more valuable than just the rough GeoIP data the ISP already has.
But isn't this what Google, Bing, Amazon, Alibaba, etc. already do when they fingerprint your device? They can't use just an IP address due to NAT, so they collect characteristics unique to your specific device. My question was more: if advertisers can already get down to the device level when you visit their site, what is the ISP's motivation if their data won't be as unique or specific? Or maybe a better question is: what organizations would be buying the "less" specific data that an ISP could get from your session data?
Unless I'm misunderstanding, I think that's kind of orthogonal to the question of encrypted transit. Plenty of services that expose only HTTPS don't encrypt at rest (and vice versa).
I agree that not everything needs to be encrypted, but unfortunately a lot of people who browse the web are concerned when the browser complains that something is not secure.
From the browser maker's side, how does a browser know whether something should or should not be secured? They have clearly taken a more aggressive approach to inform users what is going on within the underlying protocol. While I do agree that not everything needs to be encrypted, I also agree that the user should know what is or is not happening under the hood.
If everything that didn't _need_ encryption wasn't encrypted, then the use of encryption could be suspicious in itself. For my part I'm glad that we've moved to a mostly working system of encrypting most things.
The Web PKI is hierarchical, but it isn't particularly centralized (other than Let's Encrypt increasingly eating everyone else's lunch, which is probably a good thing).
But in terms of actual failure points: if you're initiating a connection over HTTPS, then the only way an attacker can MITM you is by convincing a CA to incorrectly issue them a certificate for that domain. That's why Chrome and Safari monitor certificate transparency logs, and why website operators should also generally monitor the logs (to look for evidence of misissuance on their own domains).
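As a rough illustration (not any particular tool's workflow): crt.sh exposes certificate transparency data as JSON, so a periodic check for your own domain can be a few lines. The domain and expected-issuer string below are placeholders, and the field names are what crt.sh returns as far as I recall:

    # Sketch: poll crt.sh's JSON endpoint for certs issued for a domain you own,
    # and flag issuers you don't expect. Domain and expected issuer are placeholders.
    import json
    import urllib.request

    DOMAIN = "example.com"       # placeholder
    EXPECTED = "Let's Encrypt"   # substring of the issuer you expect to see

    url = f"https://crt.sh/?q={DOMAIN}&output=json"
    with urllib.request.urlopen(url) as resp:
        entries = json.load(resp)

    for entry in entries:
        issuer = entry.get("issuer_name", "")
        if EXPECTED not in issuer:
            print("Unexpected issuer:", issuer, "for", entry.get("name_value"))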
I'm sure there is some nuance to what someone's static site is serving, but someone's blog doesn't need to be HTTPS. If they are offering downloads you can provide checksums or verify their data through other sources or by contacting them out-of-band.
Anything that needs some form of validation from any site should be verifiable in multiple ways. Just because they have HTTPS doesn't mean the provided information or data is automatically correct.
The attack you are suggesting is not commensurate to the types of blogs and information that _need_ HTTPS.
If you are operating at a level where your personal blog can have all possible transit paths compromised by a third-party such that they are hosting some or all resources that you provide for download, modifying them and producing new checksums then you have bigger problems than a blog that doesn't have HTTPS. You would also at that point consider using someone else's platform that will absorb or actively be motivated to thwart these exact scenarios. Not to say that always works out[1].
Additionally, your concern about checksums being compromised can easily be addressed by hosting packages on GitHub, GitLab, Bitbucket, Pastebin, or a Google Groups mailing list, none of which require your blog to have HTTPS. You don't have to manage getting your own certificate, pay for yearly renewals, or set up any 90-day Let's Encrypt auto-renewal bot.
Great grandma's cookbook recipes on a blog don't need HTTPS.
Maybe it's more than security theater. With mandatory TLS, i.e., encryption plus third party-mediated authentication, the ability to publish a website comes under the control of third parties, so-called "Certificate Authorities" (CAs).
The source of this third party "authority" is unclear. If a CA uses DNS for verification, then the "authority" is ultimately ICANN. And we know that ICANN's "authority" is completely faked up. It has no legal basis. These pseudo regulatory bodies have no democratic process to ensure they represent www users paying for internet subscriptions. As it happens, these organisations generally answer only to so-called "tech" companies.
Effectively, CAs and browser vendors end up gatekeeping who can have a website on the public internet and who cannot. Not to mention who can be an "authority" for what is a trustworthy website and what is not (CA/Browser Forum).
The hoops that the browser vendors make a www user jump through in order to trust a website without the assistance of a third party are substantial and unreasonable. It seems that no www user can be expected to make their own decisions about what they trust and what they don't. The decision is pre-made, certificates are pre-installed, autonomy is pre-sacrificed, delegated to so-called "tech" companies.
Meanwhile these so-called "tech" companies, who are also the browser vendors, are commercial entities engaged in data collection for online advertising purposes. For more informed users, these actors are perhaps the greatest eavesdropping threat that they face. The largest and most influential of them has been sued for wiretapping www users on multiple occasions.
There are conflict of interest issues all over the place.
tl;dr Even if the contents of the transmission are not sensitive and perfectly suited to plain text, the system put in place by so-called "tech" companies, to benefit themselves at the expense of every www user's privacy, ensures that TLS must be used as a means of identifying what is an "acceptable" website and what is not. Absence of a certificate from a self-appointed, third party certificate authority means "not acceptable". Presence of certificates from first party authorities, i.e., ordinary www users, means "not acceptable".
Let's Encrypt is doing god's[1] work to work around the CA scam. And they've made it extremely easy to use. Literally just run one command and you have SSL on your website. It may take a few more commands if you're not using one of the more standard HTTP servers.
You have to have a domain to use it, obviously. Luckily there are others doing god's work, like DuckDNS, to work around the domain scam too.
How are they doing SSL certificate management on an ESP32? Their article at https://khalsalabs.com/hosting-a-website-on-esp32-webserver-... makes no mention of how that would work, only really basic code for a static cleartext HTTP server. Is it even capable of such a thing?
Edit: I got a default nginx/1.18.0 (Ubuntu) gateway timeout message after a few minutes trying to load this page, this is reverse proxied.
The software support is incredible IMHO, it's a huge reason to use these chips. I made some toy temperature sensors with an esp32 last year, they make it so easy: https://github.com/jcalvinowens/tempsensor
Very curious about the scaling process. I've been building something on a breadboard with an esp32 and I'm pretty happy with it. Now I want it to be a lot smaller, and in one piece rather than with a bunch of wires and components on a breadboard.
How do you make the step from breadboard dev to something manufacturable?
I didn't do any breadboarding at all, I just jumped off the cliff with this. I started by designing a 1"x1" PCB in EasyEDA with just the MCU and pin headers, and had five manufactured/assembled by JLCPCB to test the core of it. The first time I'd ever touched an ESP32 was when I got those PCBs in the mail and started trying to program them! It was really fun.
Once I'd proved it worked, I pasted that 1"x1" layout into a larger footprint, and added the sensor, power supplies, and batteries. Again, I had no real way to test any of the new stuff: I just iterated until I stopped finding problems to fix, then had them manufactured. A big part of the fun of this has been having to commit to a design without the ability to test: it really makes you think. I also enjoy the exercise of writing as much of the firmware as I can while the hardware is in the mail, then seeing how much actually works when it shows up.
In terms of bad decisions... I used built-in GPIO pull-up resistors for I2C: it works, but the margin is very tight, it's just not worth it (and also means I can't put the ESP32 in sleep mode in some cases...). WiFi uses phase to encode information, so having no RF matching will impact its performance beyond the -6dB I mentioned in the README. The inductor/capacitor values are much larger than necessary. The routing of the I2C lines taking a huge bite out of the ground plane under the switcher IC is dubious. Using 1.5V alkaline batteries is nice because I don't have to worry about burning my house down... but I've gone through 200+ AAA batteries over the last year, and it feels very wasteful.
I learned most of what little I know about PCB design from this youtube channel, I can't recommend it enough: https://www.youtube.com/@PhilsLab
The next step is a system integrator like m5stack.com: build a nice unit from their library of components and let them worry about the minor issues (power regulation, etc.). If you're prototyping at home, just put them in your own enclosure; if you want to go industrial, you can 3D print something that integrates with their stuff (e.g. user-friendly modules like Core) or use the stamp components.
If you have done all the circuitry and want to just print/assemble your own PCBs, sites like PCB Unlimited will make up short runs, or Digikey will handle larger scales.
I usually use https://oshpark.com/ or https://jlcpcb.com/ with EasyEDA or Kicad depending on what you're comfortable with. A good 3D printer wouldn't hurt either.
There was no real goal beyond the experience of building the thing and making it work. I use them to monitor stuff like fridge/freezer and HVAC intake/output, and as leak detectors in my crawlspace.
As you'd probably guess, the fixed cost of the manufacturing was extremely high. Unfortunately I didn't write the numbers down... but going from memory, ordering 5 instead of 30 would have only reduced the total cost by ~20%. I remember a weird valley in cost-per-unit at a quantity of 30: my understanding is that JLC combines small orders, so my guess is that 30 of that board was the largest order they were willing to squeeze onto the same panel as another one.
The error message "504 Gateway Timeout nginx/1.18.0 (Ubuntu)" suggests that Nginx, running on Ubuntu, is acting as a proxy server and is timing out while trying to connect to the backend server. The SSL cert is on the proxy server.
OMG!! I didn't expect I'd get this much traffic from HN. I put this behind nginx, but it's still too much traffic for the ESP32-S2 chip to process. Lol
Sorry, where's the assumption it's using Kubernetes coming from? All I see in the response is nginx, which doesn't imply anything about k8s. The blog linked from the page doesn't mention k8s either (nor nginx).
That said, you can run an ingress for a service that's just an ExternalName reference with the right annotations, depending on your ingress controller, and I'm pretty sure it'd just work.
I think both commenters said those things in jest, referring to the criticism that often comes up on HN that everything has to run on k8s nowadays, even a simple website.
I remember that PIC project. I don't know if source was ever released, but I recall a lot of folks being very dubious about the claims made.
Quote:
The PIC has 1024 words (12-bits) of program ROM,
~256 bytes contain a hand-crafted RFC1122-compliant implementation of TCP/IP, including:
HTTP/1.0 and i2c eeprom Filesystem, using 3 to 99 instructions.
TCP and UDP protocol stack, using 70 to 99 instructions.
ICMP [supports up to ping -s 11], using up to 14 instructions.
IP - Internet Protocol, v4, using 68 to 77 instructions.
SLIP - Serial Line IP packetisation, about 76 instructions.
Fully buffered UART, up to 115200 bps, using 38 to 56 instructions.
Operating system: RTOS with Rate Monotonic Analysis, using 3 to 15 instructions.
Well, given that the modern web has a lot more requirements for security to even permit most browsers to view a site, it makes sense that the base hardware needs have increased noticeably.
Exactly those PIC18 devices are still in production and on sale, without any changes over the years: http://utronix.se/
Of course, no HTTPS, but that is not a platform limitation, just a feature nobody demanded: how would you get an HTTPS cert for 192.168.0.1 or a similar intranet address where those devices are supposed to work? They are just not for cloud datacenters.
You can make an HTTPS certificate with that in the SAN section, and it should work fine. You can't get one from a publicly trusted provider, of course, but that's fine; you don't own the IP.
In other words, make your own certificate authority for your own machines. It isn't that hard.
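For instance, a self-signed cert with the IP in the SAN (skipping a separate CA step for brevity) is only a few lines with Python's cryptography package. This is just a sketch; the IP address and file names are placeholders:

    # Sketch: issue a self-signed certificate whose SAN contains a private IP.
    # IP and output file names are placeholders.
    import datetime
    import ipaddress
    from cryptography import x509
    from cryptography.x509.oid import NameOID
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import ec

    key = ec.generate_private_key(ec.SECP256R1())
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"my-device")])
    cert = (
        x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)  # self-signed: issuer == subject
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(datetime.datetime.utcnow())
        .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=365))
        .add_extension(
            x509.SubjectAlternativeName(
                [x509.IPAddress(ipaddress.IPv4Address("192.168.0.1"))]
            ),
            critical=False,
        )
        .sign(key, hashes.SHA256())
    )

    with open("device.crt", "wb") as f:
        f.write(cert.public_bytes(serialization.Encoding.PEM))
    with open("device.key", "wb") as f:
        f.write(key.private_bytes(
            serialization.Encoding.PEM,
            serialization.PrivateFormat.PKCS8,
            serialization.NoEncryption(),
        ))

Then you just need to get the cert (or your private CA's root) trusted on the machines that will talk to the device.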
The problem here is not that difficulty, nor even yearly certificate renewals, or bothering with new certs on every IP address change, but (as the commenter above rightly pointed out)...
1. Planned obsolescence built into HTTPS: no HTTPS-aware server device from 1999 would work with 2023 browsers, just because its crypto is "too old". Plain HTTP works.
Being on the buy side, I am against HTTPS in such devices, but I understand the sell side's position.
Everything should be served securely these days. Prior to HTTPS being absolutely king, ISPs here used to inject EXEs with malware and do all sorts of nasty stuff. With HTTPS dominating they don't do that sort of thing anymore as the share of HTTP traffic is so low making ROI very low.
Anyway I'll give you one reason based on the above on why you should serve your content over HTTPS, it shields you from potentially having your visitors be victims of something like this and in all likelihood they will blame you for whatever malware their ISP sent their way... they did get infected from your website, after all.
And further, while edge cases around MitM do exist, the reality is that it'd almost certainly be just fine if someone's personal blog were just HTTP in 99.99% of cases. But most web traffic isn't someone's blog and really should be encrypted, and it's simple enough to set up for free nowadays, so it's going to be far easier to get most of the web encrypted if we increasingly work to phase out HTTP.
Yes, small blogs are a 'casualty' of this progression towards expecting HTTPS in that they have to put in a tiny bit more work, but if we didn't do this we'd be back in the days of nitpicking over every single 'acceptable' case of HTTP, while vendors used the lack of widespread adoption to leave session cookies in plaintext requests for tools like Firesheep to grab.
People can be really tedious when it comes to this subject. Like, for the authenticity use-case, the server could present its certificate followed by a signed but unencrypted page, in a standard way so the browser could check the signature. Then the signature for static resources can be cached on the server (or middle boxes) and no key exchange or encryption is needed, greatly reducing computational needs to serve a page while still keeping it essentially secure. There's also fewer hops in this scenario (so better user experience), and it's easier to do things like filtering with a simpler proxy without needing to install CAs. But no one wants to have a productive conversation about actual trade-offs here.
Edit: in fact, if we used client certs for user identity[0], signed requests could also be used for form submission for e.g. public forums or youtube uploads where you might not care about privacy of the submission itself.
One of the reasons is to prevent ISPs and others from intercepting the page and injecting code before it reaches the user. Putting a payload in the HTTP response is a common method; I believe it's called a man-in-the-middle attack. HTTPS has reduced a lot of these attacks.
There was news about Comcast injecting a data cap warning into a Steam storefront page shown to a Comcast subscriber. And this happened inside the Steam app, which was using HTTP at the time.
HTTPS doesn't really have to do with whether the page content is sensitive. It's more about protecting visitors from MitM attacks, traffic analysis, and their browser screaming at them and refusing to load your site because it's ""insecure""
I get that. I understand why viewing http is insecure, I don't understand why serving it is insecure.
Apparently this rubs people the wrong way. I get it, run Let's Encrypt and certbot blah blah, but if I am hosting an ESP32 in my house for a hobby project, I'm running HTTP on the LAN.
> I get that. I understand why viewing http is insecure, I don't understand why serving it is insecure.
Presumably you are serving that content so it can be consumed no? It's not like your consumers can consume https if you only serve http. But yeah I suppose if you are serving read-only content and don't give a shit about what happens client side, there's a lot less reason for https.
Serving data via http is insecure because that data can be intercepted, read and modified.
If it's entirely public data then there's no security risk to the server. The security benefit is for the clients, so unless you hate your users you should use encryption even for totally public static data.
> I understand why viewing http is insecure, I don't understand why serving it is insecure.
People are assuming you want others to be able to see what you are serving. In that case, the server is the only one who can secure the transmission to prevent MITM. The viewer cannot reach over and add HTTPS to the request to prevent their ISP from injecting ads (or making other kinds of MITM changes).
You can view a pure HTTP website through a VPN. It's basically an encrypted tunnel between you and the VPN server, passing through your ISP, so your ISP can't interfere with the encrypted connection.
However, your browser might prevent you from connecting over HTTP due to a strict HTTPS-only policy. My browser will stop any connection to an HTTP page and throw up a warning.
Having HTTPS as the only option for a site is an excellent default, both for protecting the confidentiality and integrity of the content and for validating the identity of the site to the client. Maybe a good way to put it is that the vast majority of the site's uses and data need no protection, but protecting all of it well is probably much easier to do correctly than selectively encrypting only the important parts.
My humble little personal site has largely unauthenticated, static blog stuff. It also has personal apps that nobody else uses, but I want to protect the authentication bits.
- If someone is worried they'll be found out using my site, then fine, don't use it. This advice is just for my site, and it's fine to desire security elsewhere and in other contexts.
- If an ISP or MITM wants to inject some content into my website, then fine. We'll all know not to use those providers. I promise I'm not important enough for this to be a vector someone would want to exploit.
None of the information I have to offer you requires HTTPS. I assure you.
I think it's fine that https is becoming the default, especially for web services. But we shouldn't enforce it. It's an undue burden to have to support all the certificate machinery just to serve some basic info.
We really need to get back to the basic, easy to hack web. Where it took nothing to spin up services on your home machines and serve them as demos to others. That ethos was great.
Geocities was bought for $3.6 billion by Yahoo in 1999. It launched in 1994. The web is only three years older than that.
I had my first website on Angelfire in 1996 before my 10th birthday. WhoWhere purchased Angelfire a year later, and then they were bought by Lycos a year after that for $133 million.
Also, I don't remember it being fantastic. To me, even with all faults considered, things are much nicer today.
That's bullshit when you're accessing my website, where I have some photos of some old science projects and that's it.
A much better middle ground would have been for websites to advertise certain features (login, user accounts) and for browsers to warn when not using SSL. Or to do it based on some heuristic, such as cookie use on a given domain.
The current implementation keeps everyone non-technical from using http, which is a loss for everyone.
Google unilaterally got to make this decision for everyone. Small websites don't matter to their bottom line anymore. They've already scraped and indexed the content, pulled the value away onto walled gardens, and left that web to rot.
I personally think it's pretty mean to hear that a website is hosted on an ESP32 and then post it on HN. Guess we're going to test this one to destruction...
Looks like there's a (non-caching?) nginx reverse proxy in front to do the TLS. I remember trying to do TLS on an ESP8266 and there was a hardcoded limit on the SSL buffer size, limiting the maximum cert chain that could be served. I wonder if there is a similar reason here.
Hosting a website on ESP32 is one thing, but running Python on it?! Are you insane? No wonder it gets hugged to death if more than 2 people visit it at once.
I think it will drop most of the traffic, as I posted in the load test results. The traffic coming to this is crazy. I can see the nginx access logs and the requests are not stopping!!
Why don't you configure the nginx that's in front to cache? I mean, it's still hosted on the device, it's just got a caching proxy in front, like all the big boys do.
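Something along these lines would probably do it (just a sketch, not your actual config; the cache zone name, domain, cert paths, and the ESP32's LAN address are placeholders):

    # Goes in the http{} context:
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=esp32cache:10m
                     max_size=100m inactive=10m use_temp_path=off;

    server {
        listen 443 ssl;
        server_name esp.example.com;  # placeholder domain
        ssl_certificate     /etc/letsencrypt/live/esp.example.com/fullchain.pem;  # placeholder path
        ssl_certificate_key /etc/letsencrypt/live/esp.example.com/privkey.pem;    # placeholder path

        location / {
            proxy_pass http://192.168.1.100:80;  # ESP32 on the LAN (example IP)
            proxy_cache esp32cache;
            proxy_cache_valid 200 10m;           # serve cached pages for 10 minutes
            proxy_cache_use_stale error timeout updating;
            proxy_cache_lock on;                 # only one request hits the ESP32 per cache miss
            add_header X-Cache-Status $upstream_cache_status;
        }
    }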
Just did it!! Enabling 10m caching on nginx now (it's good, at least now the link will open from the HN homepage). Thanks for suggesting this, I should have done it before.
It's a good idea; I never tried it, and probably didn't know much about nginx caching. Will nginx cache the webpage and serve it directly without hitting the ESP32 server much? I am going to read up on it now.
Even on the much lower end ESP8266 it was usual to use a web browser to let the user configure the application etc. It was fast enough, for one user at a time...
If you're looking for something similar for the Raspberry Pi Pico W, try phew! from Pimoroni:
https://github.com/pimoroni/phew
Also, I think Microdot will run on the Pico too.
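If memory serves, a minimal phew app from its README looks roughly like this; the SSID/password and routes are placeholders and the API details are from memory, so treat it as a sketch:

    # Rough sketch of a phew app on a Pico W, based on what I remember of the README.
    # SSID/password are placeholders.
    from phew import server, connect_to_wifi

    connect_to_wifi("your-ssid", "your-password")

    @server.route("/", methods=["GET"])
    def index(request):
        return "Hello from the Pico W"

    @server.catchall()
    def catchall(request):
        return "Not found", 404

    server.run()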
Cool, but is it news? MCUs have been capable of running simple webpages for at least 20 years. What's unusual is exposing it to public internet, where it can be quickly hugged to death.
Yes, I have a static IP and port forwarding. The IP can be mapped to a website (DNS name) via services like Namecheap and GoDaddy. This is enough to put it on the internet.
In this setup I added nginx in between (cache not enabled yet) for load balancing.
https://web.archive.org/web/20231105185258/https://esp.khals...