Polymath Engineer Weekly #56
Cool stuff ahead
Links of the week
Most people still consider AI Engineering a form of either Machine Learning or Data Engineering, so they recommend the same prerequisites. But I guarantee you that none of the highly effective AI Engineers I named above have done the equivalent work of the Andrew Ng Coursera courses, nor do they know PyTorch, nor do they know the difference between a Data Lake and a Data Warehouse.
I always tell people that fundraising is a sales process. The investor has money and the entrepreneur is selling ownership in his or her company. But the real product being bought or sold is “trust.” Trust that you can deliver on what you say you’re going to do, trust that you will follow up when you say you will, trust that you will be a pleasure to work with, trust that in good times and bad you’ll be committed to making the investment valuable.
And there is no short-cut, no collaboration tool, no spreadsheet, no web conference tool that can build trust even remotely as effectively as being in person.
Although these techniques can mitigate the effects of complexity in large systems, there is still sufficient complexity inherent in the problem space itself that we largely fail to enforce much order in the overall system other than some simplified diagrams that vastly minimize the scope of the problem.
This post argues that maybe we are thinking about this problem from the wrong direction. That is, rather than trying to control complexity in large systems, we can instead embrace and use this complexity to model the behaviour of the system, and then to create the correct incentives that help the individual parts succeed while also benefiting the overall system itself.
Semantic similarity search involves calculating an embedding for the user's question and then searching through my library to find the K most relevant items related to that question. These are the K items whose embeddings are closest to that of the question. However, when dealing with a large library, it becomes crucial to perform this search efficiently. In the realm of vector databases, this problem is referred to as "finding the k nearest neighbors" (KNN).
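To make the idea concrete, here is a minimal brute-force sketch of a K-nearest-neighbors lookup over embeddings. It is an illustration, not the linked article's implementation. The names `cosine_similarity`, `k_nearest`, and the toy `library` are hypothetical, and using cosine similarity as the measure of "closeness" is an assumption. The embeddings are hand-written three-dimensional vectors rather than output from a real embedding model.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def k_nearest(query, library, k):
    """Return the ids of the k items most similar to the query, best first.

    `library` maps an item id to its embedding (hypothetical structure).
    Every item is scored, so the cost grows linearly with library size.
    """
    scored = (
        (cosine_similarity(query, embedding), item_id)
        for item_id, embedding in library.items()
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Toy data: made-up embeddings standing in for real model output.
library = {
    "note-a": [1.0, 0.0, 0.0],
    "note-b": [0.9, 0.1, 0.0],
    "note-c": [0.0, 1.0, 0.0],
    "note-d": [0.0, 0.0, 1.0],
}
query = [1.0, 0.05, 0.0]

print(k_nearest(query, library, k=2))
```

This exact scan compares the question against every item, which is why it slows down as the library grows. Vector databases typically trade a little accuracy for speed by using approximate nearest-neighbor indexes instead.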
One of my stock interview questions goes: "When picking between dependencies to use in production, what factors contribute to your decision?" I'm surprised by how often I receive an answer along the lines of "GitHub stars" and not much else. I happen to think GitHub stars are a terrible metric for selecting production code, so this post sets out my idea of a healthier framework to evaluate dependencies.
I call these the Five Big Forces. I saw how they affect each other and change in logical ways to produce the Big Cycle that produces big changes in the world order. I came to realize that if one understands and follows each of these forces and how they interact, one can understand most everything that’s changing the world order. That’s what I’m trying to do.
Book of the Week
Do you have any more links our community should read? Feel free to post them in the comments.
Have a nice week. 😉