Doing offline-first well implies that the app has a (sqlite or similar) local database, does work on that database and periodically syncs the changes to the backend. This means you have N+1 databases to keep in sync (with N=number of clients). Essentially distributed database replication where clients can go offline at any time. Potentially with different tech (for example sqlite on the client and postgres on the backend).
When the backend isn't very smart it's not too hard, you can just encapsulate any state-changing action into an event, when offline process the events locally + cache the chain of events, when online send the chain to the backend and have the backend apply the events one by one. Periodically sync new events from the backend and apply them locally to stay in sync. This is what classic Todo apps like OmniFocus do.
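A minimal sketch of that pattern, assuming made-up event shapes, a hypothetical `applyEvent` reducer, and made-up sync endpoints (not OmniFocus's actual design):

```typescript
// Queue events while offline, replay them when a connection comes back.
type TodoEvent =
  | { type: "task_created"; id: string; title: string; at: number }
  | { type: "task_completed"; id: string; at: number };

const pending: TodoEvent[] = []; // events not yet acknowledged by the backend

function applyEvent(e: TodoEvent): void {
  // mutate the local database (sqlite, IndexedDB, ...) here
}

function record(e: TodoEvent): void {
  applyEvent(e);   // optimistic local apply
  pending.push(e); // remember it for the next sync
}

async function syncWithBackend(): Promise<void> {
  if (pending.length > 0) {
    // send the whole chain; the backend applies the events one by one, in order
    await fetch("/api/events", {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(pending),
    });
    pending.length = 0;
  }
  // then pull events created on the server since our last known cursor
  const since = localStorage.getItem("lastServerSeq") ?? "0";
  const res = await fetch(`/api/events?since=${since}`);
  const { events, seq } = await res.json();
  events.forEach(applyEvent);
  localStorage.setItem("lastServerSeq", String(seq));
}
```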
The problems start when the backend is smarter and also applies business logic that generates new events (for example enriching new entities with data from external systems, or adding new tasks to a timeline etc). Obviously the new server-generated events are only available later when the client comes back online.
When trying to make the offline experience as feature-rich as possible I always end up duplicating almost all of the backend logic into the clients as well. And in that case, what is even the point of having a smart backend?
you might like the architecture of holo-chain (stupid name, imo), it's still mostly going under the radar (also because they chose to refactor their library from go to rust to improve security):
"Holochain combines ideas from BitTorrent and Git, along with cryptographic signatures, peer validation, and gossip. Holochain is an open source framework for building fully distributed, peer-to-peer applications. [It's] purpose is to enable humans to interact with each other by mutual-consent to a shared set of rules, without relying on any authority to dictate or unilaterally change those rules. Peer-to-peer interaction means you own and control your data, with no intermediary"
good dev intro: https://medium.com/holochain/holochain-reinventing-applicati...
"Holochain is [...] like a decentralized Ruby on Rails, a toolkit that produces stand-alone programs but for serverless high-profile applications."
hmm no i think it does. the 'Resilience and availability' section is relevant for your question: https://developer.holochain.org/concepts/4_dht/
holochain is completely distributed, so there is no 'offline' and 'online' because there is no third party. it sounds weird writing that but i'm hoping it might challenge you to dig a bit deeper into the docs, because it's there. if not, please come back and let me know so i can pass on feedback about the docs that were unclear or unsatisfactory for you.
> does not solve conflict resolution
from the docs: "Holochain DNAs specify validation rules for every type of entry or link. This empowers agents to check the integrity of the data they see. When called upon to validate data, it allows them to identify corrupt peers and publish a warrant against them. [...] (the DNA is simply a collection of functions for creating, accessing, and validating data.)"
How does it relate or differentiate with the GNU/Net project? What naming schemes does Holochain support and could it theoretically interop with the GNU Name System? 
so for all the people who might struggle with this (like my mom), but who still want to enjoy these new apps, Holo (again, stupid name, but this time it's the org that stewards the open source holochain library) came up with a strategy that allows people to rent out some spare processing power on their computer/home-server, to host holochain applications for the above-mentioned people (so sort of like an Airbnb for AWS). they even ran a successful crowdfund that sold a million dollars' worth of hardware for committed early adopters (plug-and-play home servers).
so essentially the Holo hosting network is IPFS (and Holofuel is like Filecoin), but instead of hosting someone's files, you are running encrypted app code for them so that they can take part in the app/network, without having the required technical chops (the holofuel currency measures processing power). the FAQ does a good job of explaining it a bit more: https://holo.host/faq/
i'm not super interested in the whole holo thing (they did a filecoin-type crowdfund to pre-sell hosting credits), yet i am glad they did it because it meant the team could ramp up development. they are already alpha-testing now.
about marx and engels. i am personally most excited about the potential for http://valueflo.ws on top of holochain (library called hREA) , because it will allow us to move away from today's Enterprise Resource Planning (ERP) software, into a new paradigm of Network Resource Planning (NRP) software. i hope it will have a big impact and enable the growth of the democratic and transparent supply chains of the next (socialist) economy.
> How does it relate or differentiate with the GNU/Net project? What naming schemes does Holochain support and could it theoretically interop with the GNU Name System? 
i'll take a look at it, i'm not too familiar with GNU/Net.
> I always end up duplicating almost all of the backend logic into the clients as well
Yes it looks like event sourcing. But most of the classic event sourcing implementations I've seen are mostly backend only.
The project I'm currently hacking on has an MQTT broker that is used by both clients and backend and the events can come from anywhere. For example when a client sends a `location_updated` event, the backend reverse-geocodes this into an address and possibly sends out `place_entered` events, or sends out notifications for Tasks that are now relevant on the new location, or "completes" a "go to location X" task resulting in a `task_completed` event.
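As a rough sketch, a backend enrichment step like that could look roughly as follows, assuming the `mqtt` npm client and a hypothetical `reverseGeocode` helper; topic names and payload shapes are illustrative:

```typescript
import mqtt from "mqtt";

// Hypothetical helper that calls some external geocoding service.
declare function reverseGeocode(lat: number, lon: number): Promise<{ placeId: string }>;

const client = mqtt.connect("mqtt://broker.example.com");

client.on("connect", () => client.subscribe("events/location_updated"));

client.on("message", async (_topic, payload) => {
  const event = JSON.parse(payload.toString()) as { userId: string; lat: number; lon: number };
  const { placeId } = await reverseGeocode(event.lat, event.lon);
  // The backend derives a new event from the client's event and publishes it;
  // offline clients only see it once they reconnect and catch up.
  client.publish(
    "events/place_entered",
    JSON.stringify({ userId: event.userId, placeId, at: Date.now() })
  );
});
```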
Enabling all this in an offline-first paradigm is hard and requires a lot of duplication, I am very close to just saying screw this and requiring network connectivity for most of the features.
While it is true that you have to duplicate the mutations in the basic setup, you do not have to share the querying/reading code as it lives primarily on the client.
My daily environment is mostly JVM-based (desktop/android/server) so your project is probably not a great fit for me but I'm definitely going to look into it for some inspiration.
I discovered this problem in my domain about two months ago. My solution was to split my Aggregate Root into smaller pieces, to use DDD terminology. In my domain, each user/client owns their own data. However they may choose to publish that data and have it be publicly visible so others may view/comment/copy/pull-request it. It's basically Github for flashcards. So, I have an Aggregate Root for publicly visible flashcards, and another Aggregate Root for a user's personal flashcards. It helps that there are distinct behaviors for each Aggregate Root - there's no real point to leaving a comment on a personal flashcard, and there's no real point to "studying" a public flashcard (because that implies logging your study history to the public card). It does mean, however, that there needs to be a translation layer - it should be possible to convert a private flashcard to a publicly visible one, and it should also be possible to copy a public flashcard to your own personal collection.
Obviously this is very domain specific.
Heck, as much as I don't like them, even Facebook got this right last I knew when I still used their app years ago. It never required a network connection. If you weren't actively receiving updates, it just showed you the cached last known feed, accepting that it wasn't up to date. And if you tried to post something, it would just cache that too and wait to send it. They didn't invent eventual consistency, either. It's been a basic operating principle of distributed organizations, especially armies, for thousands of years.
It's based off something I call "source derivation"....
It has data sources where things get stored, repos that handle syncing and pulling data, and ents which represent the data but add functions to it.
It’s built on the incredible PouchDB (the original offline first db), which is truly a feat of engineering. However I feel it is somewhat neglected (not implying that the original maintainers have any responsibility to the users; this is free open source, and the fact they gave it away in the first place is brilliant). When I last used it about 6 months ago there had been very little activity on the GitHub repo in years, and I am concerned about their use of automatically marking inactive issues stale after 60 days. There is so little activity that the only issues that are open are ones from the last 60 days.
I found a sync conflict (basically a hash collision; the attachments are unintentionally stripped from the document before hashing) and submitted a ticket, but unfortunately, as there is so little activity, the issue was marked as stale and closed automatically. It’s not a bug that many people will come across, but it is a legitimate issue for anyone using binary attachments.
So, rxdb is really great but I would be cautious about using it when the underlying platform is looking a little neglected. I truly hope someone has the time to take it on as it is an incredible toolkit.
Use a voting system or explicitly prioritize work instead.
If you've ever worked on a project with 1,000+ issues, you'll know how big a difference small stuff makes when it comes to efficiency, especially for people who maintain the project. GH Issues really is lacking in so many ways, and as a result people come up with all kinds of crazy automation strategies to help make the issue database more "useful" to the developers, even if it's basically second class automation. Stalebot is one of these. The idea is really (at heart) that you just want to keep the open ticket count low because that's one of the simplest Signal-to-Noise ratios you can use as a search filter or mental criteria. If you have better tools (more powerful search, customized forms, powerful tagging, etc), this isn't true, but you have to play the hand you're dealt.
I don't think it's a good strategy, mind you. But I think understanding these recent trends as an effect of older, more fundamental causes is worth pointing out. This is all based on my experience, mind you. But it helps understand the thought process. And people see these tools being used on their big projects, so they kind of naturally gravitate (or at least try them) out of curiosity.
For the curious, two issue trackers that are substantially better than GitHub Issues are Maniphest (Phabricator) and Trac. Trac was a bit annoying to run and I think effectively unmaintained now, but as a bug tracker it's actually still really good. (It was also small and easy enough to hack that we were able to make modifications to fit our workflow, and maintain them for a long time.) I still miss both of them a lot every time I open a big project on GitHub and have to start searching for issues... Here's hoping "GitHub Issues 3.0" will get some things right and they won't wait another 10 years before doing major updates.
It only becomes a problem if you leave so many issues untouched that that becomes a chore.
Issues don't stop being real if someone out there runs into them and can't find a solution because the issue has been closed. Issues very much stop being real if the only person in the world who cares about them decides that they don't care or solves it themselves without sharing their wisdom. The former hurts the people whose problems are considered unimportant, while the latter hurts the developers of the project who now have an essentially dead issue in their tracker.
The solution? Close stale issues, but with the possibility of letting people say: "Hey, this is closed but is now relevant to me, so let's reopen it."
That way, there are never open issues that no one cares about, while the ones that someone starts caring about can be reopened, until they're either fixed or no one cares about them yet again.
- stalebot: closed due to inactivity
- reporter: still an issue
- stalebot: reopened
- the problem persists and hasn't been solved to date
- no one actually cares enough to solve it because the discussions just die out until someone runs into it
- the problem is also unimportant enough for it not to warrant either a resolution, or getting closed by the devs as a "won't fix"
An actively maintained SvelteKit/RxDB starter kit with built-in auth might get RxDB some new fans.
I am aware of that problem. Pouchdb got some good love in the last few weeks, where some people made good PRs with fixes for the indexeddb adapter. But still, it is mostly unmaintained and issues are just closed by the stale bot instead of being fixed.
So in the last major RxDB release I abstracted the underlying pouchdb in a way that it can be swapped out for a different storage engine. This means you could in theory use RxDB directly on SQLite or indexeddb. In practice, of course, someone has to first create a working implementation of the RxStorage interface.
Thanks for the chuckle.
I have a pretty simple strategy though: I use SQLite on the mobile devices and when the device is back within network coverage, take a copy of the SQLite database, zip it up and throw it up to the server. It ends up in an S3 bucket (all the device database backups end up with a UUID as their name), which kicks off an automatic process via a Lambda (triggered by S3) that imports the SQLite database into the bigger DB, and the job is done.
It works pretty well, SQLite databases compress REALLY well. The only tricky bit is having to check the version code of the database in case I have an old version of the app floating around in the wild (it happens).
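For illustration, the import step could look roughly like this in a Node Lambda. This is a sketch that assumes AWS SDK v3, gzip instead of zip for simplicity, the better-sqlite3 package, and made-up table names; it also includes the version check mentioned above:

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { gunzipSync } from "node:zlib";
import { writeFileSync } from "node:fs";
import Database from "better-sqlite3";

const s3 = new S3Client({});

export async function handler(event: any): Promise<void> {
  // Triggered by the S3 upload; fetch and decompress the device's database copy.
  const { bucket, object } = event.Records[0].s3;
  const res = await s3.send(new GetObjectCommand({ Bucket: bucket.name, Key: object.key }));
  const compressed = await res.Body!.transformToByteArray();
  const dbPath = `/tmp/${object.key.split("/").pop()}.sqlite`;
  writeFileSync(dbPath, gunzipSync(Buffer.from(compressed)));

  const db = new Database(dbPath, { readonly: true });
  // Check the schema version first, in case an old app version is still in the wild.
  const { version } = db.prepare("SELECT version FROM schema_info").get() as { version: number };
  if (version < 3) {
    // run a per-version translation/migration here
  }

  for (const row of db.prepare("SELECT * FROM measurements").all()) {
    // insert `row` into the central database (Postgres etc.) here
  }
  db.close();
}
```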
What you're doing here is not much more than a backup.
These "merge conflicts" seems like something that must be handled in a custom way for many apps.
It does open up some possible vulnerabilities, like whether the user can overwrite other people's information, but mitigating that requires the same validation layer you should have anyway.
At a minimum you have to follow  but you don't get to say "it's safe to open malicious files or process unrelated queries" and "SQLite has a good security track record because all our CVEs are only from untrusted queries and malicious input files and CVEs are useless anyway". Those are facially contradictory positions, likely written by different team members reflecting their individual perspectives, rather than there being a well-thought-out security stance (at least in my opinionated viewpoint).
Hopefully your CSV parser has fewer vulnerabilities.
You are opening an untrusted binary file using SQLite on your backend. This is 100% not safe.
You should convert to JSON or some other serialization before you send it, then your API should only accept JSON. Zipping a SQLite database is not a good serialization method... Accepting and opening an arbitrary sqlite binary file is asking for trouble.
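A rough sketch of that alternative: export only the rows you actually need as plain JSON on the device and let the API validate every field. Table names, fields, and the use of better-sqlite3 are illustrative assumptions:

```typescript
import Database from "better-sqlite3";

// Export the relevant tables as a JSON payload instead of shipping the raw database file.
function exportForUpload(dbPath: string): string {
  const db = new Database(dbPath, { readonly: true });
  const payload = {
    schemaVersion: 3,
    tasks: db.prepare("SELECT id, title, completed_at FROM tasks").all(),
    locations: db.prepare("SELECT id, lat, lon, recorded_at FROM locations").all(),
  };
  db.close();
  return JSON.stringify(payload);
}

// The server then only ever parses JSON and validates each field against an
// explicit schema before touching its own database.
```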
In my eyes, the following should be equal:
- a binary file that's used for storing data, like an SQLite database
- a text file that's used for storing data, like an XML or JSON file
Those problems could be addressed in a pretty easy way, plus if you're security conscious, just run the importer in an entirely separate container which will basically be single use (possibly distroless, if you want to go that far), with resource limits in place.
But that's not my point. My point is that both of the data formats should be pretty much equal and them not being so in practice is just a shortcoming of the software that's used - for example, even spreadsheets ask you before executing any macros inside of them. There definitely should be a default mode of addressing such files for just reading data, without handing over any control of the computer to them.
> Zipping a SQLite database is not a good serialization method...
So with this i disagree. SQLite might be flawed, but zipping an entire dataset and sending it over the network, to be parsed and merged into a larger one, is an effective and simple solution. Especially given that your app can use SQLite directly, whereas storing the state as a large JSON file probably won't be as easy and will incur the penalty of having to do a conversion somewhere along the way. Here's why i think it's a good approach: https://sqlite.org/appfileformat.html
Who's to say that JSON/XML/... parsers also wouldn't have CVEs, as well as the application server, or back end stack, or web server that would also be necessary? In summary, i think that software should just be locked down more to accommodate simple workflows.
Okay, but you need to defend against reality, not against what could in theory be possible.
Sandboxing is a pretty good solution, at least.
> Who's to say that JSON/XML/... parsers also wouldn't have CVEs, as well as the application server, or back end stack, or web server that would also be necessary?
Raw SQLite files are a huge attack surface that isn't directly designed to be secure. JSON is an extremely simple format that can be parsed securely by just about anything (though occasionally different parsers will disagree on the output).
XML, a data format explicitly designed for interchange where parsing untrusted input was a design goal of the language.. contains ‘external entities’, which permits the person crafting an XML doc to induce a vulnerable reader of the document to pull in arbitrary additional resources and treat the data from them as if they came from the document creator.
There are all sorts of confused deputy attacks you can perform via this kind of mechanism.
If XML can have that kind of issue, when it ostensibly contains no arbitrary execution instruction mechanism at all, how can you expect a sqlite database file, which can contain VIEW and TRIGGER definitions, to be safe?
It’s unsafe because the attack surface is so large and the use case of an untrusted attacker isn’t something strongly considered.
> In my eyes, the following should be equal:
In an ideal world maybe, but this hasn’t been true for the last 50 years.
For the second point, I would say that SQLite has a massive attack surface, it would be very difficult to ensure that that technique can’t lead to an exploit of some form.
Unless there is some way to introduce a malicious side effect to a select statement in sqlite?
If you depend on users (attackers) not being able to modify their software or environment and poke around at each and every bit of your (publicly accessible) interfaces you are doing something awfully wrong!
> but the user needs to be authenticated to upload a database
Is registration for your service limited to a fixed amount of trustworthy people? Otherwise this isn't an obstacle.
> the lambda that extracts the data is sandboxed to access only what it needs
Using a simple serialisation format would be orders of magnitude safer (and simpler)
> Unless there is some way to introduce a malicious side effect to a select statement in sqlite?
See all the links posted here already
As for introducing a malicious side effect into the query, that's simple: just add an UPDATE, DELETE, CREATE, or INSERT. When you say that the importer can only run SELECT statements, do you mean that it's only authorised to make SELECT statements, or are you simply assuming that the importer won't be able to mutate any data? Because I suspect it's the latter, and that's not correct. I really truly hope your application is not responsible for anything important.
Download sqlite file from S3
Mount file as a sqlite database
Execute a statement like SELECT * FROM userData on the mounted database
Connect to an online database and insert the returned data into an importedData table for later validation and integration
What can go wrong here, given the user has control over the sqlite file? How could someone who has observed that your system uploads a zipped sqlite file craft a payload to do something malicious?
Well, that code would run just fine even if userData was not a table - it could be a view. That means the data returned to your query doesn’t have to come from data in the sqlite file they uploaded, but could be being calculated at select time inside your process. Are you sure there aren’t any sqlite functions that a view could use that can read environment data or access the file system? If there are, they could get your code to import that data into their account - data that might easily include S3 access secrets or database credentials.
Are you also sure there’s nothing in a sqlite file that tells sqlite ‘load the data from this arbitrary external file and make that available inside the schema of this database’? Then a view could pull data from that context.
Maybe that would let someone craft a sqlite database that imports the contents of your AWS profile file as if it were one of their data values.
Now, I did take a look at the sqlite SQL syntax and I will say I don’t see anything that looks immediately exploitable (no ‘readEnvironment()’ built-in function or anything), but that doesn’t mean there’s nothing there (are there any undocumented test features built into specific implementations of sqlite, maybe?). But the question you need to consider is: mounting fully untrusted db files just might not really be a vector sqlite is built to defend against, in which case that puts the onus on you to be sure that the file is as you expected.
Also, where are you left if in the future a new version adds a feature like that?
ANY mechanism along these lines that lets a sqlite db pull in environment or file system data would make this system exploitable within the bounds of sqlite, even if sqlite contained no ‘vulnerabilities’ like buffer overruns from maliciously crafted files.
And the crazy thing is, these kinds of vectors have shown up in data-exchange-oriented file formats like XML and YAML, so it’s honestly prudent to assume that in a richer format like sqlite they are almost certainly present until proven otherwise.
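If raw database files are accepted anyway, one cheap (though incomplete) mitigation is to inspect sqlite_master and reject anything that isn't a plain table, so a SELECT can only read ordinary rows. A sketch, assuming better-sqlite3; note it does nothing against bugs in the file parser itself:

```typescript
import Database from "better-sqlite3";

// Reject any uploaded database that contains views or triggers.
function assertPlainTablesOnly(dbPath: string): void {
  const db = new Database(dbPath, { readonly: true });
  const suspicious = db
    .prepare("SELECT type, name FROM sqlite_master WHERE type IN ('view', 'trigger')")
    .all() as Array<{ type: string; name: string }>;
  db.close();
  if (suspicious.length > 0) {
    throw new Error(`rejecting upload: found ${suspicious.map((r) => r.name).join(", ")}`);
  }
}
```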
That's simply not true. There is no internet on many parts of a train journey. There is no internet in parts of the underground rail network. There is no internet on the plane. There is no internet in nature. There is no mobile internet outside the EU, until you get a new SIM card. There is no usable internet in some hotels.
If you travel a bit, you know that internet is far from guaranteed.
I visit my parents most months. They live in rural Ireland. Not like a one off build down a back road miles from civilisation, but a small village of a few hundred people. They absolutely do not have reliable internet, and at this point they've cycled through every available provider.
To visit them I take a few hours train journey. For about half this train journey the internet is not reliable, whether edge/3g/4g or onboard service.
Before the pandemic, I visited the US a few times a year, United's onboard internet is very limited, expensive, and reliability is "it sometimes works".
So I regularly run into cases where I need to prepare ahead of time for no/spotty internet and still get surprises as some app refuses to launch because it decided now is the time it needed to update over a connection that's doing single digit kilobytes per second, or speak to a license server or whatever.
Nearly every thread involving cars or traffic also makes this abundantly clear.
I live in rural PNW. Electric is at LEAST 20 years from being a feasible and reliable solution. ...and I say that as a civil engineer with some specialization in creating EV charge stations. 30-50 might even be more plausible.
When I'm in the field, small things can turn into life-and-death situations pretty quickly out here when you're an hour and a half drive from cell reception, followed by another hour and a half drive to town with a small hospital/sheriff station, and you're up a dirt road where a tow-truck simply won't come. And the geology is notoriously unstable and slides/washouts/fallen-trees happen constantly.
My clients LIVE in those places, I only visit. They drive big diesel rigs (They often NEED to transport big heavy stuff, and you can store diesel for long time periods in a tank onsite), and a lot of them also have transfer tanks in the back of their truck for extra range. (Because it's needed!) They also almost all carry food/water/shelter/A chainsaw/tools/a gun/etc in their vehicle for a reason. Self-rescue is all you got a lot of the time.
But, to be fair, there are a lot of folks who will say 'I'd never go without a car' that haven't lived in a place with great public transportation. I used to think that way as well until I spent six months living in a place that had great bike paths and lots of options for local and regional public transportation. I started off with the intent of trying no-car daily living and it was perfectly fine. In six months I rented a car one time for a weekend trip, that was it.
So to me the lesson is to try to have a bit of empathy for the personal experience of the person making bubble points and focus on expanding their perspective rather than debating their position.
This leaks into the misinformation topic as well but there's another thread for that :)
It's a pretty big country too, with shitty roads and pretty low population density.
You're falling into the same trap that the article and GP's are highlighting: your experience that a FWD vehicle is enough "for me / my relatives" does not apply to this situation.
1) survival in animal / vehicle collisions with bears, deer, and moose
2) delivery of goods, including building material, animal feed, human food, etc
3) lack of even basic road infrastructure maintenance by government or municipality
4) "over prepared is only prepared", you're it, you're on your own
I lived for 20 years in a semi-rural area; you could certainly live for 99% of the time without a large cargo vehicle or 4x4. The other 1% you were chancing your life. Cargo and deliveries were certainly an issue though. Now just slide that ratio towards the middle.
Also, I don't think "licensed hunting firearm for wildlife protection" is quite relevant to "gun rights". You don't need to buy your fifth AR-15 or full auto Uzi at the grocery store to protect yourself against bears.
However, I live in an area where stumbling upon foreign cartel marijuana grows is relatively common. They are known to aggressively defend them with firearms which are not legal to own in the jurisdiction they're in. It's also not uncommon for a truck full of guys intent on committing armed robbery to roll up onto a client's property.
An AR-15 would be an ideal defensive weapon for that use-case.
(I still think gun regulation is a good thing, with some background checks to reduce the chances of people with e.g. psychosis getting their hands on assault rifles.)
In case any parts of my comment were unclear, I'll try to reiterate or clarify.
First I'm genuinely curious about what kind of wildlife would pose a threat to the point where you need to defend yourself by carrying guns in your car, and where you would risk such an encounter. I try to convey that this is curiosity more than criticism by comparing it to the genuine need on Svalbard.
I then address their claim about "gun rights". My point is that protecting yourself from wildlife isn't about gun rights. No one (as far as I know) is looking to ban a licensed hunting rifle or high-caliber handgun where the owner would need it for protection. My point is that many "gun rights" advocates want to have five AR-15s or a fully automatic Uzi, guns that are highly capable of killing a large number of people and not very effective against bears. In other words, I understand their clients' (legitimate) needs but I don't think it's relevant to the concept of "gun rights".
This is why state-run services matter, because they will 'serve' in areas where private businesses cannot turn a profit and so have no incentive to serve.
5-6 years ago in India, only the state-run telecom BSNL's service would be available at remote hill stations. But with 4G it couldn't keep up with the private telecoms and the company is nearly done for. So now, again, there's no connectivity in remote areas and mountains, as private players don't bother ~~serving~~ doing business there.
This got especially worse with the pandemic: many in such areas have lost communication with the outside world, and remote education is non-existent for children there.
Recently I was on a flight and had prepared a book on my iPad the day before. iBooks decided that it was a good idea to “offload it into the cloud”, a book that hadn't even been on my device for 24 hours, with plenty of space available. Who knows…
Yeah, that's a dick move when I'm not on wifi or don't have data access at all on a long hiking trip.
I had the same problem with Google Books on my Galaxy devices. There's a feature that will let you pin the book to be available offline. But you have to do that for every book.
It really takes away the usefulness of the devices.
For example, a grocery market near me somehow manages to attenuate cellular signals so badly there's no mobile connection more than 3 meters inside. Meaning no Internet when I walk between shelves, and no Internet when I stand in a queue for 10 minutes.
(Sure, the building has plenty of wired and wireless Internet connections in it, but I'm not allowed to use any of them.)
And like GP the nearest grocery has basically no signal inside the store, it only picks up beyond the checkout lanes.
Which is sort of ironic in that modern communications is dependent on radio. Older buildings tend to be better for cell reception.
Yep that's pretty much what I inferred at the time.
It was pretty frustrating back then though, good thing I moved in in late summer and opening the windows any time I needed to do admin (either online or by phone) was fine, would have been rather annoying in winter.
> Which is sort of ironic in that modern communications is dependent on radio. Older buildings tend to be better for cell reception.
Indeed, and moving from an older building (where reception had never been any issue) is exactly what I was doing.
My home was built in 1830 and 2.4 GHz wifi is strictly line of sight within the building, and cellular phones must be kept near windows on the side of the building facing the nearest cell tower to function.
And this isn't in a rural area either, this is life in a brownstone in the middle of a large city...
That must be hell.
But the point is, those 10 minutes are exactly the "low quality" time I could spend doing something useful with your app, instead of using "high quality" time for it (e.g. when I'm in my office). But if your app doesn't work off-line, I can't do that.
With boredom I would just read ebooks, a practice that goes back to the very-expensive-mobile-internet era and a Symbian S60 phone. I knew a queue was epic when I had to go online to download another book because I had finished one end-to-end while waiting...
To make it funnier, the files were essentially a Palm resource container with simplified HTML, and I had a reader on pretty much everything, from the small 320x320 screen of a Nokia E51, through le random Windows Mobile GPS, to various Android devices, PCs, etc. :)
The more interesting point is that "internet access is flaky" isn't necessarily a good argument for going offline-first; it only suggests that you need your client to be more resilient to being knocked offline in an online-first world. Rather, the article argues that offline-first is interesting in itself as a completely different architecture, one closer to a desktop application where the binary is delivered through the browser.
Put differently — they're not trying to argue you should make Gmail work offline, but rather that you should consider structuring your application as Thunderbird inside the browser.
Some MNO base stations are powered by solar and only operate a few hours a day, sometimes a full day and sometimes a partial night (depending on batteries, if they have any).
Not everyone has electricity, so people walk to town to charge their phones at spaza shops for a few minutes/hours. Fiber/POTS is non existent.
Outside of larger town and cities, reception is non-existent or spotty at best. You might get reception near a highway.
My current provider dies on me at least once a day and the speeds are atrocious. Mobile network is also weird. Every time I go into a store for shopping I lose my connection. Once I come out I get a cute text message saying "Welcome to Germany, don't forget to get tested"
My flat is the same, essentially no signal inside, just open the window and I immediately get full strength LTE.
Ask your provider for a microcell if you want to change the situation (and often waste less battery; unless cellular is disabled, smartphones tend to really dislike having no signal or a poor signal).
Their opening remarks about poor internet access now being rare are followed by
>So do we even need offline first applications or is it something of the past?
Their premise isn't "now internet access problems are solved we should use offline first", it's "even if we say that internet access is solved, we should still use offline first".
Even if we have perfect and constant connections, you still obtain benefits by writing in this model: for instance, if you assume you have a constant and perfect network connection, you can connect a websocket to the server to ensure you always have the current data for each page and data type. Or, you could follow the offline-first model and have a single update/subscription system to mirror the database locally.
I'm very nervous about their presentation. It says "offline first" and "websites lie because they show you the current state of the data at the last time you had a network connection". If the latter is a problem, a lie, then you can't possibly write offline first. You might write using a model that works equally well online and offline, but you necessarily accept forking data and multiple current representations if you allow a computer to show the data without being connected to the authoritative repository of that data.
I think the author has many good ideas, and might have a very good implementation of a very good set of ideas, but this intro page reads like the sort of thing that gets misinterpreted a dozen times and you end up with something worse than current interpretations of "premature optimisation is the root of all evil".
And even if there is, there may be:
- a small bandwidth
- a limited contract
- a huge ping
- regular network errors
And even with a good, unlimited network, local will always be snappier than doing a round trip.
I design my websites with those trips in mind.
In short, there are many potential differences between ping and actual latency.
The output of the program is RTT minimum, maximum, average, mean deviation, and percentage of packet loss.
You can also measure latency over UDP and TCP using other tools. For 99% of practical use cases there is actually no meaningful difference.
Ikea in Ljubljana (Slovenia), Villesse (Italy) and Klagenfurt (Austria)... no mobile reception inside. They have wifi, if you remember to connect to it.
We could have internet everywhere if our governments gave 2 shits and weren't corrupt as all get out.
I love complaining about the lack of effiency in the EU and the West as much as the next person. But you're comparing the West's "bullying" with a region in the world that consists of quite a few harsh dictatorships that do lot worse than bullying. I take a sketchy phone signal on a train over systemic human rights "bullying" any day.
Also, some buildings change over time in how they attenuate, especially freshly built ones where the walls are still "drying" can have close to 0 reception inside. When my parents built their current home, I had to keep an informal map of where the signal was strong enough to use GPRS (yay 7s ping in MUDs) and for voice we usually went outside.
We have agreed upon solutions for:
API client (fetch, axios, react query, etc.)
UI state management (redux, mobx, xstate, etc.)
UI frameworks (vue, react, etc.)
But there's an intermediate piece between the API layer and the UI state that is always implemented ad hoc. We need some common solution for caching the API data, reshaping or querying it, and managing multi-API workflows (for example a synchronous process where the frontend calls multiple microservices). Most answers to this end up being something like "get ur backend team to implement better APIs", but that's not realistic in many cases.
frontend schemas differ from backend schemas because they need to support sync, but it should still be possible to inherit the backend schema, transform it in predictable ways, save a lot of work
'full stack schemas' would be a 'pit of success' change IMO
Haven't tested it yet, though.
This skips completely over the simpler options of not having a server at all. I guess because this is an ad for RxDB.
Edit: to be clear, I'm sure this paradigm is useful in many applications. But it strikes me as odd that something called "offline first" doesn't seem to include the possibility of software that runs entirely on one's own computer.
The RX here means it is reactive - i.e. you can subscribe to state and react to it. This is how it updates the display across independent tabs when the data changes, for example.
THAT'S the main point of RxDB; the sync is just another feature (which you don't have to use and isn't even a default).
EDIT: I'm not affiliated with RxDB, and don't use it - but I have investigated it previously.
i couldn't tell if they are being serious or not. they probably are, which is kinda depressing. "offline-first is a software paradigm where the software must work as well offline as it does online."
“Traditional” applications on desktops/laptops/similar were offline-only, with some later getting online sync as an optional feature, so offline-first doesn't need to be stated; it is the assumed default when something isn't offline-only.
Web based applications started out online-only, only later in their evolution sometimes getting the ability to work properly offline. Offline-first is still an unusual property, and may forever be, so gets mentioned where it is relevant. Many are simply “works offline” where offline operation is bolted on as an afterthought and may not be at all optimal (for instance in the case of two edits to the same object the last to sync automatically wins, clobbering the other with no attempt to merge or branch and no notice that this has happened or is about to happen, and no care given to the consistency of compound entities when this happens).
Apps for phones & tablets fall in between, so the matter is more vague. I have heard offline-first in reference to them, but usually they are either online-only or “works offline” (offering little or nothing more than buffering changes until a connection is available). Some, particularly games, are like traditional apps (offline-only or offline with some sort of online sync/backup).
I've been building offline-first apps for quite a while in both desktop and mobile space.
I have a different definition of what an offline-first app is:
Offline-first apps are apps which can run and function completely offline or without needing the internet for an indefinite amount of time. To offline-first apps, providing all functionality offline is the "primary objective" and any online functionality such as syncing to cloud is secondary.
Also, I personally don't consider an app using temporary cache to store data to be an offline-first app. It must use a local database. Sometimes the "offline-tolerant" apps are confused with offline-first apps. Offline-tolerant apps often provide partial functionality and eventually need an internet connection to sync data.
From the website (https://rxdb.info/adapters.html)
> Uses ReactNative SQLite as storage. Claims to be much faster than the asyncstorage adapter. To use it, you have to do some steps from this tutorial.
Some other projects worth checking out that will help you implement the pattern:
Replicache - real-time sync for any backend. Works via simple push and pull endpoints and is built by a small team of 3 devs with decent browser experience (Greasemonkey, Chrome, etc).
Logux - a client/server framework for collaborative apps. From Evil Martians, well known for: postcss, autoprefixer, browserslist, etc.
RoomService also used to be in the space but recently left it to pivot to something else.
The largest problem you’ll end up solving is conflict resolution so having a good understanding of the tradeoffs involved with your (or the underlying) implementation is key.
Cell service is definitely spotty out here. The final selling point on this home was that it's serviced by AT&T Fiber. So I have gigabit internet. If AT&T goes dark for some reason -- I lose power, they lose power, someone cuts the line, whatever -- then I can't access the internet and can't call anyone unless I drive for a few minutes.
Offline first is a good thing to have but we do not have "better mobile networks" and "no internet" is only a rare case if you've stuck your head in a city.
At my folks' place on the other side of Houston there's zero bars of cell service sitting on their couch. But if I stand up then I get full voice and data cell service.
My own testing had my web browsers choking up when I had a few thousand documents stored in the browser's IndexedDB. This is on a 10 year old Mac Mini with 8gb ram. Could be that newer PCs do better but I doubt they'll do much better.
Using CouchDB with PouchDB.js provides a "Live Sync" option that syncs data both ways and that feature works very well with the apps I've made which do not have 1000s of users accessing the same DB. In my case there are probably not more than a dozen users accessing the same DB. And in my case there is not much chance more than one user is modifying a document at any given time.
Also, in my case, there is no "backend logic" being processed. That's all done in the user's web browser.
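For reference, the live two-way sync being described is roughly this (the database names are placeholders):

```typescript
import PouchDB from "pouchdb";

// Local database lives in the browser (IndexedDB); remote is a CouchDB instance.
const local = new PouchDB("reports");
const remote = new PouchDB("https://couch.example.com/reports");

// Continuous two-way replication: changes flow in both directions and the
// sync automatically retries when the network comes back.
local
  .sync(remote, { live: true, retry: true })
  .on("change", (info) => console.log("replicated", info.direction))
  .on("error", (err) => console.error("sync error", err));
```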
Having a datastore in every device, that now need to sync is giving me headaches already. Syncing is potentially very hard.
After all, if a client is offline, nothing they do can affect anyone else until they go back online.
What was simple form validation is in "offline first" something akin to a mandatory content moderation step.
They were forced to go down this path as it was clear reddit would add their own image hosting and remove the need for imgur so the site would have to be self sustaining.
I regard it as the definitive exploration of local or offline first software. They end up building an offline-first Trello clone which can sync with peers locally or on the internet.
Apps like Notion feel quite sluggish to me on a higher-latency 15 Mbps connection. And Figma is pretty bad on decent 3G when loading files on a fresh load.
Building a proper syncing solution isn't that simple especially when it's for multidevice and requires conflict handling. Replicache looks quite good to me but I haven't found any similar solution that's opensource with MIT/GPL license.
We've all experienced this: you want to click a button (or select an input field, ...), but right before you do it moves away. Maybe something finally loaded which pushed the content down. Maybe some content was synced in (as is the case in the UX examples of this article).
The solution is, afaik, well known: add a UI element (that obviously does not move other elements) that informs the user that "new information is available".
For instance Gmail's yellow drawer informing users that (a) new message(s) is available in the thread.
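A bare-bones sketch of that pattern; `renderPendingItems` is a hypothetical function that merges the buffered items into the list only when the user asks for them:

```typescript
// Don't move content under the user's cursor: buffer synced items and show a
// non-moving banner instead; the list only changes when the user clicks it.
declare function renderPendingItems(): void;

function onNewItemsSynced(count: number): void {
  const banner = document.getElementById("new-items-banner")!;
  banner.textContent = `${count} new item(s) available - click to show`;
  banner.hidden = false;
  banner.onclick = () => {
    renderPendingItems();
    banner.hidden = true;
  };
}
```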
(addition) Some benchmarks
(edit) You can even measure it yourself
Instead, I've been working on my own library which also uses LMDB and flatbuffers. It's C++ only and still a WIP, but in case anyone else is interested, it's here: https://github.com/hoytech/rasgueadb
Just spitballing here: maybe there’s a way to use Redis (and RediSearch) compiled to WebAssembly and then use it on client side?
 - https://www.youtube.com/watch?v=cmGr0RszHc8
I tried to build my own in the past but it is truly hard! I made an "event log" database, but it could sometimes grow so big that it filled the server disk, and most events became useless after a while. So figuring out what truly counts as an "event" is not as obvious as it seems.
Is there a paradigm that could work here? I'm using SQLite + PostgreSQL (and can't move to any fancy NoSQL).
Yeah, and allowing developers to just access state from anywhere and mutate it doesn't generate lasagna code at all.
The issue so far with RxDB I see is how silly complex the syncing gets. You just see it doing its thing and hope whatever payloads it sends are optimal for your use case. And while offline-first seems neat in theory, it's not necessary for most web apps IMO. For desktop or mobile apps it's a different thing, but they have other options too that browsers don't allow.
I'm hoping this thread produces some good choices. Ideally, other than account creation (which seems like requiring internet is fine), everything caches automatically and syncs magically. Offline account creation would be great, until you want to verify people's email addresses or use OAuth
The job of the server is syncing clients, not keeping the data away from the clients. Good examples are Git or any reasonable mail client like Evolution or K-9.
Bad examples? The Gmail website and especially the Gmail app: if you scroll too far you have to wait and hope that the servers are reachable and working.
Old school programmers know how to read and write files on the local file system and load them into the free store. Things work fast, stand-alone, and reliably. But these people are more expensive.
Of course, because Apple wants to make the user “want” an app.
IMHO reliable replication is one of the hardest things to code right.
I still have nightmares.
Last time I did something similar I had to roll my own Rails API and client-side cache.
Anyhow, a way to force this behaviour in APEX is to make every user interaction a write action on the DB. This way you either save locally or to the backend (but you don't have to worry about the sync between the two).
Each database uses an arbitrary byte string to mark a position in a sequence of updates to the database. Each document has the sequence counter of when it was created/updated/deleted. This sequence counter only matters to that particular database (it does not need to be mirrored, it's just a local ref. of WHEN the doc was modified in that particular database).
Syncing is then a process of looking up the last read sequence counter from a checkpoint document (i.e. what was the last modification) and passing that sequence counter to the `changes` endpoint to get a list of all documents with a sequence counter AFTER that, and then pulling/pushing those documents to the local/remote database, and saving the new latest sequence counter.
The official docs  give some more info. One key point is if I modify a document several times between syncs, it will only show once in the changes feed with the latest sequence counter for that document. Couch's conflict resolution strategy is a topic for another time, but an interesting one.
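A simplified, one-direction version of that checkpointing loop using PouchDB's changes API; the checkpoint document id is made up, and real replication additionally has to deal with conflicts and revision trees:

```typescript
import PouchDB from "pouchdb";

async function pullOnce(source: PouchDB.Database, target: PouchDB.Database): Promise<void> {
  // Read the last sequence counter we already processed (if any).
  let checkpoint: any = { _id: "_local/sync-checkpoint", lastSeq: 0 };
  try {
    checkpoint = await target.get("_local/sync-checkpoint");
  } catch {
    // first sync: no checkpoint yet
  }

  // Every doc changed after `lastSeq`; each doc shows up only once,
  // at the sequence of its latest modification.
  const changes = await source.changes({ since: checkpoint.lastSeq, include_docs: true });

  // Write the docs as-is, keeping their revisions (replication-style write).
  const docs = changes.results.map((c) => c.doc).filter(Boolean) as any[];
  await target.bulkDocs(docs, { new_edits: false });

  // Remember how far we got, so the next run only asks for newer changes.
  checkpoint.lastSeq = changes.last_seq;
  await target.put(checkpoint);
}
```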
The only place I've seen "offline sync" work is where MongoDB was used server-side for the sync, which then gets synced with postgres - in this case the models are forced to be the same.
But then you still have the problem of which side to sync first: postgres to mongo or mongo to postgres? What if a conflict arises? And having a device turned on after being off for a month WILL cause issues.
I don't think there is a real solution here yet. At least not one that is perfectly automatable.
For simple apps with not much logic, you can get away with different front-ends/offline-syncs adding their own rows, and then only respecting the latest row in the main db, usually based on timestamp. But if you have something more complex where multiple workflows get triggered as side effects, then you may run into trouble when the latest row is in fact not the truth of the reality we wanted, so there is still some risk of ending up with states or workflows based on a false reality.
This can become a nightmare if you have a few scheduled events/tasks/crons that do things periodically, so the only way to mitigate that is to fully embrace eventual consistency and idempotency, and the "easiest" way to get there is to embrace the actor model paradigm (see erlang, akka/akka.net, Orleans framework, F# mailboxes, go-routines, etc).
Point being, comparing two versions of applications against each other is not enough - you may need to version your data too and use timestamps to make a final decision on which version is truth. You may also need to set or build a tolerance system to say it will only sync "old" data if within x amount of days, lets say less than 7 days old. And so on.
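A tiny sketch of that "latest timestamp wins, but only within a tolerance window" rule; the field names and the 7-day window are illustrative:

```typescript
interface VersionedRow {
  id: string;
  updatedAt: number; // client-side timestamp of the edit
  data: unknown;
}

const MAX_STALENESS_MS = 7 * 24 * 60 * 60 * 1000; // only accept syncs less than 7 days old

function resolve(incoming: VersionedRow, current: VersionedRow | undefined): VersionedRow {
  // A device that was off for a month should not silently overwrite newer data.
  if (Date.now() - incoming.updatedAt > MAX_STALENESS_MS) {
    return current ?? incoming; // too old: keep what we have (or flag for manual review)
  }
  if (!current || incoming.updatedAt > current.updatedAt) return incoming; // last write wins
  return current;
}
```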
However, I gave up on CouchDB after my server kept getting hacked by crypto miners. I'm sure whatever exploit they were using has been patched, but I'm hesitant now to use a DB that's open to the world.
Particularly for a complex document like a report with hundreds of fields and arbitrarily sized lists for comments/observations?
Perhaps between tabs, but what about multiple devices/browsers?
ignorance is wisdom.