I want to start a company in the future, so I started allocating a bit of time every day to explore different theses.
I don't have anything concrete yet. However, this post is a (messy) compilation of:
- a few aspects that I have identified as important to me, and
- startup patterns that I have observed.
Feel free to jump around. Beware that these notes are not polished.
Important aspects
Startup patterns
Important aspects
Getting the trend right
I got this idea from reading Brad Jacobs' book "How to make a few billion dollars." He mentions that one of the first things he looks for when evaluating ideas is whether there's an underlying big trend going on. If you're in a market that grows much faster than the GDP, you'll experience growth just by showing up every day.
I think I experienced that a bit with Openlayer. In the early days, we were a platform serving data science/machine learning teams. I think the number of potential customers was growing slowly. Then, LLMs appeared, and suddenly, every software engineer became a potential AI engineer. Everyone started looking at AI. We've felt the market pull. Competition appeared but our shipping velocity and number of customers also grew. It's as if the market is pulling the product out of us.
I contrast this to a person I know who is rich because she owns many gas stations. Every year, she buys new gas stations and makes more money. However, I think she is getting a larger and larger piece of a shrinking pie.
It's better to be in markets where the pie is growing.
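To make the pie comparison concrete, here is a rough sketch with made-up numbers (the growth rates, market size, and the `project_revenue` helper are all my assumptions, not data about any real business). It compares holding a constant share of a growing market against gaining share in a shrinking one.

```python
def project_revenue(market_size, market_growth, share, share_growth, years):
    """Return revenue per year (year 0 included) for a simple compounding model.

    All inputs are hypothetical: market_growth and share_growth are yearly
    rates (0.2 means +20% per year), and share is capped at 100%.
    """
    revenues = []
    for _ in range(years + 1):
        revenues.append(market_size * share)
        market_size *= 1 + market_growth
        share = min(1.0, share * (1 + share_growth))
    return revenues


# Growing pie: the market grows 20% a year and we just hold a 10% share.
growing_pie = project_revenue(100.0, 0.20, 0.10, 0.0, years=10)

# Shrinking pie: the market shrinks 5% a year while our share grows 5% a year.
shrinking_pie = project_revenue(100.0, -0.05, 0.10, 0.05, years=10)

print(f"growing pie:   {growing_pie[0]:.1f} -> {growing_pie[-1]:.1f}")
print(f"shrinking pie: {shrinking_pie[0]:.1f} -> {shrinking_pie[-1]:.1f}")
```

Under these assumed rates, the constant-share business in the growing market multiplies its revenue several times over, while the share-gaining business in the shrinking market ends up slightly below where it started, because each year's 5% share gain is more than cancelled by the 5% market decline.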
Intellectual curiosity
Part of what moves me is creating extremely high-quality things. Practicing my craft. I felt a bit of it when I was exploring some ideas this year. I was thinking about re-doing software that is currently shitty and it energized me because I knew I could do a much better job.
I think I do my best work when I'm doing it almost autotelically.
Higher-order effects
People attribute part of NVIDIA’s success to the fact that they “sell shovels during the gold rush.” They enable the AI revolution. However, just as interesting as what enables revolutions is what is enabled by them.
I started thinking about this after talking to Nicolas Vizioli. He sent me this blog post.
An interesting way to capitalize on the AI opportunity is not to bet directly on the tech, but rather, on the higher-order effects that this tech causes. The example Nicolas gave me was game discovery — a thesis he is exploring.
With AI, it becomes easier to ship lower quality/superficial games (AI slop). This aggravates the problem of discovery, since new games are churned out like crazy, and finding the good ones becomes harder. Therefore, game discovery tends to become even more of a problem than it is today.
The blog post mentions historical examples. E.g., railroad: some people profited from railroads directly; many others profited from booms enabled by railroads (e.g., making money transporting oil, fresh meat became available everywhere because of trains, etc.)
What are other higher-order effects of AI?
Is bigger better?
Economies of scale are a big part of what makes some companies successful. In most cases, a bigger company is better off because of them. However, this is not always true. E.g., a prestigious design agency that is small, selective, and expensive.
I think I want to get into something where bigger is better. Not 100% sure though.
Patterns
Democratize something that is already available for a select few
This is a common startup pattern I've observed. This is what Nubank successfully did: "bank the unbanked."
I also stumbled upon this idea when I was reading about Replicate's story:
In the 1950s, if you wanted to use a computer you most likely needed a Ph.D. in electrical engineering—that’s how complicated they were. But over time, software was introduced that made using computers easy, no matter a person’s background or experience level.
For the greater part of the 2000s and 2010s, machine learning’s useability was akin to mid-century computers, despite its growing ubiquity. Machine learning models were bulky, complex, on-prem systems, inaccessible to most engineers. To run them required expertise in the field and resources only available to larger institutions that could afford expensive servers and GPUs.
Jansson, who had been a machine learning engineer at Spotify prior to founding Replicate, understood firsthand the barriers developers faced in using AI models. “I could read all about new developments in AI, but there were no tools readily available to actually apply any of it,” Jansson says. When he encountered an issue at Spotify, he would delve into relevant research, which would be in the form of scientific papers—not usable code. “Experts would write software, turn it into prose for the sake of academia, and then engineers like Andreas would have to turn their research back into software in order to use it,” Firshman says. “The whole process was bananas.”
There are many sectors where this idea applies.
For example, I see the way software engineers use computers (and their tools) as a niche. Eventually, I believe more people will also use computers like this. Shortcuts everywhere. Beautiful and fast UI. AI-native. The question is: what is the broad audience of this? This is what motivated me to explore some ideas this year.
Abstract away complexity from users for tasks that are a pain in the ass
I first thought about this in the context of LLM code writing.
We often deem "writing code" as the main task for a software developer, and LLMs are now great at it. However, running and testing the code are also important tasks that are not trivial with LLMs. I was thinking that a piece of software could abstract away all the complexity of setting up the correct environment, installing the dependencies, etc. from the end user -- and that would be great, because I wouldn't need to bother figuring out these steps myself.
A flow that I've followed multiple times this year is to:
- spend 20% of the time writing code in a programming language I'm not familiar with (trivial with AI help)
- spend 80% of the time figuring out how to run/thoroughly test the code written (requires "traditional learning" on my part).
I think that this coding example is too narrow. The principle of abstracting away complexity from the user, though, is general.
What are other situations where this applies?
Break information asymmetries
I remember I had a huge dopamine hit when I first understood why “Open Banking” in Brazil matters. In particular, I like this excerpt (translated from this source):
Open Banking operates on the principle that consumer data belongs to the consumer, not the bank they are associated with.
Currently, Brazil faces significant information asymmetry. For example, if a customer has an account with bank A, that institution holds the customer’s credit history, which indicates, for instance, whether they are a good payer or not.
But if the customer wants to request a loan from bank B, where they don’t have an account, they’ll face difficulties. This happens because bank B doesn’t have enough data to assess the person’s repayment ability to approve or deny the credit. Bank A holds that information. As a result, the operation becomes riskier for bank B, and it is less likely to grant the loan.
The customer then becomes dependent on the institution where they have an account and subject to its rates, further incentivizing the already high banking concentration in the country.
Are there other places where solving an information asymmetry would result in a win-win situation?
Every market where one side is transacting a lot and the other is transacting a little has information asymmetries. E.g., real estate, car dealerships.
I’ve spent some time thinking about the information asymmetries that exist in health insurance. People know their health records but insurance companies don’t. Consequently, insurance companies price the average patient and not specific people.
For example, if I’m a healthy individual, I’d be overpaying. I thought that breaking this information asymmetry could create good incentives – e.g., exercising more, eating healthy, etc. to reduce the amount paid for health insurance.
However, after doing some research I found out that, in Brazil at least, this practice would be illegal. Economic efficiency is sacrificed for the sake of equity. In Brazil, the only variable you can take into account to price health insurance is the age bracket.
I’ve also thought a bit about the information asymmetries that exist in real estate. The landlord knows little about people renting their place. They try to de-risk it by requiring a bunch of documents. Is there a better way?
Tech education
I feel like Brazil lacks a lot in tech education and, for its size, doesn't have a lot of successful tech entrepreneurs.
Can we "teach" Brazil into a tech power?
There's nothing fundamentally different between the Brazilian and the American in Silicon Valley. However, the startup and work culture in these two places is very different.
Why?
I feel like the tech education platforms focus only on helping people get jobs (e.g., Alura, Awari, Trybe). They are not focused on "the craft." People don't learn things deeply to the point where they can walk by themselves. They memorize some things that are important for interviews and then keep job hopping, without building things with quality.
A slightly related question that I thought was interesting is the following, from Patrick Collison's website:
Why are there so many successful startups in Stockholm?
London and Paris have surprisingly few successful tech startups for their size. Stockholm, a city of less than 1 million people, has Spotify, King, Klarna, iZettle, and Mojang, all valued at more than $1 billion. What's true of Stockholm that isn't true of other European cities? (Similar questions apply to Provo, Utah, and Tallinn, Estonia.)
Startups in Brazil lack talent and quality. This is a problem because entrepreneurship is a powerful problem-solving machine.
What's also a bit frustrating to me is that the startups that thrive here are mostly fintechs. I wonder what other sectors could benefit.
I was recently thinking about how to teach tech. My initial gut reaction would be to teach middle/high schoolers. The problem is that this is the time when many of the "serious" students are mostly focusing on the university entrance exams (vestibular). Anything that doesn't help them with the vestibular is a distraction.
University students could be a focus. It's less fun though, because people have already kind of formed their worldviews. Many see university as a time to "relax" and do nothing meaningful. Of course, I shouldn't over-index on those.
I wonder if something like Synthesis makes sense.
What would I teach more to my past self?
- Math
- Programming
- Machine learning
- Physical exercise
- Teamwork
- Risk taking
- Communication
- Emotions
I think I'm pretty good at learning. I think a good process would be: learn from canonically good content. Question your knowledge. Discuss with LLM. Ask LLM for guidance. Spaced repetition. Projects.
Internet the un-interneted
This is a special case of Democratize something that is already available for a select few, where the select few are people that can effectively use the internet.
As a software engineer, I can use the internet for almost everything I want to do.
A new bug appears? Google it and check StackOverflow. Evaluating new tools that we might incorporate into our workflow? Google it and read a few blog posts.
LLMs are a statistical summary of the internet. They are useful because the data was open and they could leverage it.
Certain fields don't have an internet yet. (And, consequently, no LLMs.)
For example, pharmaceutical companies. Maybe they'd like to do research on patient data, but that data is private and they can't find it.
What if we created an internet for domains that don't have it yet?
Messy inbox problem
Many domains suffer from the “messy inbox problem.” The term inbox was chosen by a16z, but I prefer input.
This problem happens when there are many data input sources, many of which are manual. After this data is inputted, a cascade of software tools is used. The bottleneck, then, is the input. If the input is solved, it unlocks the whole cascade.
Example:
Take the case of our earlier medical practice. While the job of Anne, an administrative assistant, may be to sift through physical faxes and enter in patient information into their electronic medical record (EMR), she must then pass the baton to Sally, whose job it is to qualify patients and evaluate their eligibility and insurance benefits. Jimmy then requests prior authorization from the patient’s health plan, and Joe schedules that patient’s visit against the limited availability of their physician’s calendars. Each of these administrators use a different software system to complete their work.
Documentation for LLMs
"Reading" is one of the primary activities humans do when they want to learn something.
Furthermore, humans have developed books according to the expected audience. For example, we know it's pointless to give a child an adult book.
We're at a point where many companies want to use LLMs to perform tasks. These LLMs need to be prompted and taught about the companies' needs. However, I don't know if people are taking into account the audience, like we do for books.
A somewhat related thing is the llms.txt proposal from Jeremy Howard.
Boring but necessary
At Openlayer, we work with enterprise clients. When they want to evaluate a tool (like us) they usually have a long list of requirements that they want to check and a scoring associated with each. Then, each vendor is responsible for presenting or preparing a document showing that they support each requirement.
This year, I spent more time than I'd like to admit preparing these "requirement conformity" documents.
Could this be automated? Maybe. If we maintained good product documentation and AI agents had access to our platform, I bet they could craft a document like this in no time.
This is a task that is boring but necessary. I'm sure there are many other tasks like this all around.
Thanks to Nicolas Vizioli for reading these notes and pushing me to make them public.