Browsers need pipes

Introduction: Websites are not composable

One of the core pillars of Unix can be summarized as "expect the output of your programs to be the input to other programs." This is beautiful because it lets you chain programs together to carry out much larger tasks. Web browsers have moved far away from this principle. Many websites reinvent concepts instead of relying on an existing tool. I find using a web browser infuriating because of this lack of composability. In this post, I'll walk through some common frustrations with web browsers and give a glimpse of how great browsers could be if they adopted a more Unix-like structure.

Different websites need the same data from you

A common frustration of mine while using the web is entering the same data into different websites over and over. Let's say you own a home and have a friend whose house you often visit. Almost any task you do online that involves these houses asks for the same information. If you need a handyman to fix something at your house and are looking for one on several websites, you'll enter the same data each time (address, square footage, number of rooms, etc.). If you are ordering pizza, you need to choose between your house and your friend's house, and every app you use makes you enter the address again.

I find this very frustrating and unnecessary. There is so much data about myself and the things I'm working on that I have to enter into so many websites: credit cards, resume info, vacation dates, etc. It feels inefficient to have to memorize all of this data and take the time to enter it into every new site that needs it.

A basic proposal

What if you could store your important data, like your address and resume info, somewhere else, and then "pipe" it into a site whenever the site needed it? We could have a list of shared concepts like "House" and "Work Experience" that all websites could accept.

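// Placeholder: a real Address would be its own shared concept.
type Address = string;

type House = {
    address: Address;
    numOfRooms: number;   // whole number
    squareFeet: number;   // whole number
    howToFindDescription: string;
};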

Now, when you go to your favorite website to order pizza and it asks for your location, you click a button and an "address picker" pops up. It lets you pick from a map, choose an address saved on your file system, pull one from Google Drive, etc. The point is that the address picker can be separate from the website, and even from the web browser, as long as it can produce an address and pipe it to the site. Another nice property of this approach is that the data could be piped into the website without clicking a button at all. You could pick the home you are concerned with first, then pick the site you want to pipe it into.
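To make the "pick, then pipe" flow concrete, here is a minimal sketch of how it might look in TypeScript. Everything in it is an assumption for illustration: `Picker`, `Site`, `pipe`, `pickSaved`, and `pizzaSite` are hypothetical names, `Address` is reduced to a plain string, and the in-memory map stands in for wherever you actually keep your houses. It is kept synchronous for brevity; a real picker with a UI would almost certainly be asynchronous.

```typescript
// Shared concepts (assumed shapes; Address is a placeholder string here).
type Address = string;
type House = {
  address: Address;
  numOfRooms: number;
  squareFeet: number;
  howToFindDescription: string;
};

// A picker is anything that produces a value: a map UI, a local file, a cloud drive.
type Picker<T> = () => T;

// A site declares the shared concept it accepts and returns some result.
type Site<T, R> = { name: string; accept: (input: T) => R };

// pipe hands the picker's output to the site, like `picker | site` in a shell.
function pipe<T, R>(pick: Picker<T>, site: Site<T, R>): R {
  return site.accept(pick());
}

// Stand-in for houses saved on the file system.
const savedHouses = new Map<string, House>([
  ["home", { address: "12 Oak Lane", numOfRooms: 5, squareFeet: 1800, howToFindDescription: "blue door" }],
  ["friend", { address: "48 Elm Street", numOfRooms: 3, squareFeet: 1100, howToFindDescription: "side entrance" }],
]);

// Build a picker that selects a saved house by name.
function pickSaved(name: string): Picker<House> {
  return () => {
    const house = savedHouses.get(name);
    if (house === undefined) throw new Error(`No saved house named "${name}"`);
    return house;
  };
}

// A pizza site only needs the address and the directions.
const pizzaSite: Site<House, string> = {
  name: "pizza.example",
  accept: (house) => `Deliver to ${house.address} (${house.howToFindDescription})`,
};

// Pick the house first, then the site to pipe it into.
const confirmation = pipe(pickSaved("friend"), pizzaSite);
```

Because the site only sees a `House`, swapping `pickSaved` for a map-based or cloud-based picker would not require any change on the site's side.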

The website could also write the house data back to the system (with the user's permission). You go to Zillow, pipe your property in, have Zillow calculate the estimated value, and save that back to wherever you stored the house. Zillow could save the home value in one of two places on the main home object. There might be a shared House.value property that Zillow could update, or Zillow could store its own data on the house, like House.zillow.estimated-value. It would do the latter for Zillow-specific properties that don't fit the shared house schema but that Zillow would like to read again the next time this house is piped in.
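The two write-back options can be sketched as follows. This is only an illustration under stated assumptions: the `extensions` map keyed by site name is my stand-in for the House.zillow idea, and `applyWriteBack`, `WriteBack`, and the `userApproved` flag are hypothetical names, not part of any existing API.

```typescript
// Assumed shape: shared fields plus a per-site extension area keyed by site name.
type SiteData = Record<string, unknown>;
type House = {
  address: string;
  value?: number; // shared estimate that any site may read
  extensions?: Record<string, SiteData>; // site-specific data, e.g. extensions.zillow
};

type WriteBack = {
  shared?: { value?: number }; // updates to the shared schema
  own?: SiteData; // updates to the site's own namespace
};

// Returns an updated copy; nothing changes unless the user approved the write.
function applyWriteBack(house: House, site: string, update: WriteBack, userApproved: boolean): House {
  if (!userApproved) return house;
  const previous: SiteData = house.extensions?.[site] ?? {};
  return {
    ...house,
    ...update.shared,
    extensions: { ...house.extensions, [site]: { ...previous, ...update.own } },
  };
}

const myHouse: House = { address: "12 Oak Lane" };
const afterEstimate = applyWriteBack(
  myHouse,
  "zillow",
  { shared: { value: 350000 }, own: { estimatedValue: 350000, lastChecked: "2024-01-01" } },
  true,
);
```

Keeping site-specific data under its own key means one site's write-back cannot clobber another's, and the shared fields stay small enough for everyone to agree on.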

Sites would no longer need to store your data on their servers; they could rely mainly on the user as the data provider. This doesn't mean they can't copy the data to their servers; they just wouldn't need to in order to build a functioning app.

I imagine "structured data storage" services would start popping up where you can store all this info in the cloud. Companies like Google and Microsoft would likely offer easy ways to store your houses and resume information in the shared structured format and to pipe that data to many sites.

In defense of standards

Something this whole idea relies on is everyone speaking the same language. We can't have slightly different versions of houses, or our sites won't work together. This would need to be an agreed-on standard. I think there are two things to say here: shared standards like this are good, and even if they don't catch on, we ultimately won't need them.

There are already many standards. The W3C publishes standards and definitions for many parts of the web and for how browsers should work. The internet is no stranger to these kinds of standards, so we shouldn't be scared to "add another standard." The question is, how do we make sure everyone uses the same one?

This was mostly solved around 1999, when Tim Berners-Lee proposed the Semantic Web. It allows concepts like "person" and "house" and relationships like "lives in" to be shared across websites. Each concept has a URL like example.org/entities/person, and websites can extend other concepts. This forms a giant structured internet where data from different sites can be stored in compatible shapes.
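As a rough illustration of "each concept has a URL and can extend another concept," here is a small sketch. It is not the Semantic Web's actual data model (RDF); the `Concept` type, the `registry`, the `allProperties` helper, and every URL and property name below are hypothetical, loosely following the example.org/entities/person pattern.

```typescript
// A concept is identified by a URL and may extend one other concept.
type Concept = { id: string; extends?: string; properties: string[] };

// Hypothetical registry of shared and site-specific concepts.
const registry = new Map<string, Concept>([
  ["https://example.org/entities/person", {
    id: "https://example.org/entities/person",
    properties: ["name", "livesIn"],
  }],
  ["https://example.org/entities/house", {
    id: "https://example.org/entities/house",
    properties: ["address", "numOfRooms", "squareFeet"],
  }],
  ["https://realestate.example/entities/listed-house", {
    id: "https://realestate.example/entities/listed-house",
    extends: "https://example.org/entities/house",
    properties: ["estimatedValue"],
  }],
]);

// Collect a concept's own properties plus everything it inherits.
function allProperties(id: string): string[] {
  const seen = new Set<string>(); // guards against cyclic "extends" chains
  const props: string[] = [];
  let current: string | undefined = id;
  while (current !== undefined && !seen.has(current)) {
    seen.add(current);
    const concept: Concept | undefined = registry.get(current);
    if (concept === undefined) throw new Error(`Unknown concept: ${current}`);
    props.push(...concept.properties);
    current = concept.extends;
  }
  return props;
}

const listedHouseProps = allProperties("https://realestate.example/entities/listed-house");
```

A site that only understands the base house concept can still read a listed house, because the extension adds properties without removing any of the shared ones.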

This isn't much different from how Google and Facebook sign-in work right now. They make it easier for a website to retrieve basic information about a user. Each of these "sign in with" implementations has a slightly different concept of a "User," but they could conform to some shared User and extend it with their specific options. Every platform already provides similar properties for its users, such as name, profile picture, email, etc.
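A shared User with provider-specific extensions could look something like the sketch below. The field names are invented for illustration and are not the real payloads returned by Google or Facebook sign-in; `SharedUser`, `ProviderAUser`, `ProviderBUser`, and `welcome` are all hypothetical.

```typescript
// Hypothetical shared concept that every sign-in provider would agree on.
interface SharedUser {
  name: string;
  email: string;
  pictureUrl?: string;
}

// Provider-specific extensions (field names invented for illustration).
interface ProviderAUser extends SharedUser {
  locale: string;
}
interface ProviderBUser extends SharedUser {
  friendCount: number;
}

// Site code written against SharedUser works with any provider's user.
function welcome(user: SharedUser): string {
  return `Welcome, ${user.name} <${user.email}>`;
}

const fromA: ProviderAUser = { name: "Ada", email: "ada@example.org", locale: "en-GB" };
const fromB: ProviderBUser = { name: "Bo", email: "bo@example.org", friendCount: 12 };
const greetings = [welcome(fromA), welcome(fromB)];
```

The site only depends on the shared part, so adding a third provider would not require touching `welcome` at all.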

Shared concepts work best when there is little contention or ongoing innovation around what something is. We can have a shared User because everyone has the same expectations for a user. The idea of an "Address" has not changed much either. But it is hard to have a shared ChatBot, since we are still defining what that looks like, adding and removing properties as we go.

Even if not everyone gets on board with this, AI will bridge the gap between the sites that use the standard and those that don't. It seems likely that we will have systems in the future that can automate entering data into, and extracting data from, websites that don't offer easy APIs for it. If your site doesn't adhere to the standard, some automated bot will make it adhere.

Conclusion

If the internet had embraced this composability concept earlier on, I think it would feel much more "fun" to use. A big barrier to accomplishing some tasks is getting the necessary content into the site, and piping would streamline that whole process. Sign-in with Google took off because it took a huge load off of developers and made auth "just work." I think we can start to have these same niceties for more types of structured data. Developers would no longer need to spend time building form inputs for the data they need; they could rely on browser APIs and focus on what makes their site unique.

In an upcoming post, I'll discuss how this could work for concepts other than houses and how we can string together multiple websites to make a large workflow with this kind of idea.