It's a very good system for having just enough web analytics for the majority of use cases.
No need to give up users' privacy and force them to download intrusive proprietary JavaScript that monitors their behavior and, in some cases, passes private information to third-party services without their knowledge.
Hopefully we can go back to the old days of relying more on these kinds of tools to just get anonymized statistics.
I haven't researched it yet, but I've been guessing/wondering whether relying on logfile stats like these removes the GDPR cookie notification requirement for your site. I should probably finalize my understanding of that one way or the other. :)
It's important to point out that IP addresses are considered PII, so storing them in log files requires you to have a data privacy policy covering all affected persons.
Ah, I -- a lame USian -- assumed that essentially default Apache logging was "OK." I've been wondering how much you can log before data privacy rears its head, so I guess it's a little tighter than I thought!
Probably you can, since you are not setting any cookies on the user's device. But most likely you are using some web fonts or a CDN which might set a cookie. So for a modern web application it will be really hard unless you host everything yourself and rely on server logs for any analytics.
You have to be careful, though, if you are processing API requests and capturing personally identifiable information. Your logs also need to be GDPR compliant.
GDPR and cookie notifications are different things, though. You can still run Piwik (now Matomo) or something similar on your own infra, set cookies from it, and be GDPR compliant as long as you don't collect PII (and comply with the other parts).
Analog works in CLI browsers against its own HTML files (or via webserver), and among other features comes localized into 30+ languages. I didn't see any metrics in the live demo of this that Analog doesn't have in its default report straight out of the box. Fewer fancy graphics, though.
I think there's no way to get the report to tell me which referrers led to which page?
The "official" answer seems to be to filter the log file with grep for your condition (page /foo) and generate a separate report. Which is unwieldy, I would love to be able to click through.
We had it implemented for quite some time, but later moved to a different platform as it had issues correctly parsing historical logs. We did try to hack around it and wrote some scripts to zcat and merge the output, but in vain.
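For anyone trying the same thing, goaccess can usually read rotated gzipped logs piped through zcat alongside the live log -- a minimal sketch, assuming your rotated files sit next to access.log and use a standard combined format:
$ zcat access.log.*.gz | goaccess access.log - --log-format=COMBINED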
Were you trying to parse nginx, Apache or HAProxy logs? I believe it depends on your logging strategy, and then in GoAccess you need to define the log format.
It will parse according to the log format [1].
For example:
$ sed -n '/'$(date '+%d\/%b\/%Y' -d '1 week ago')'/,$ p' access.log | goaccess -a -   # only feed entries from the last week
$ awk '$8=$1$8' access.log | goaccess -a -   # prepend field 1 (e.g. the vhost) to the request so top URLs show their virtual host
$ goaccess access.log --log-format='%^[ %^[ %^[%x.%^] %^ %^/%^/%^/%^/%L %^ %h:%^"%r" %s %v %^ "%u|%^" %^ %^ %^ %^=%b' --date-format=%s --time-format=%s   # custom format for a log with epoch timestamps
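For standard Apache or nginx combined logs you likely don't need a custom format string at all; the predefined COMBINED value should be enough:
$ goaccess access.log --log-format=COMBINED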
These are just some examples, but I feel that with a bit of sed or awk text processing you can feed most common access logs into GoAccess.
GoAccess is awesome. The one problem I had with it was that it often struggled to determine which requests came from bots rather than real users, which skewed the statistics. Maybe if I had more real readers it would make less of a difference.
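For what it's worth, goaccess has a built-in crawler filter you can try, though it matches on user agents and won't catch everything:
$ goaccess access.log --log-format=COMBINED --ignore-crawlers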