Polymath Engineer Weekly #46

Read more, learn more

Felipe Alcantara
Apr 25, 2023

Hello again.

Links of the week

Expiring vs. Long-Term Knowledge

The point, then, isn’t that you should watch less CNBC and read more Ben Graham. It’s that if you read more Ben Graham you’ll have an easier time understanding what you should or shouldn’t pay attention to on CNBC. This applies to most fields.


Breaking the limits of TLA+ model checking

Take for example observational determinism (OD): does an algorithm always converge on the same answer? Let’s use the example of four threads, each nonatomically incrementing a counter once. To show that the final value of x isn’t always 4, you just have to find one trace where it’s not 4. But to show that the final value is inconsistent, you have to find two traces that get different answers! The TLA+ model checker (TLC) can do the first but not the second.
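
To make the race concrete, here is a small Python sketch (not the article's TLA+ spec) that brute-forces every interleaving of four threads doing a nonatomic increment, read then write, much like TLC enumerates traces. It shows the final value can be anything from 1 to 4:

```python
from itertools import permutations

# Four threads, each nonatomically incrementing a shared counter x:
# read x into a local temp, then write temp + 1 back.
N = 4
steps = [(op, i) for i in range(N) for op in ("read", "write")]

finals = set()
for order in permutations(steps):
    # Keep only interleavings where each thread's read precedes its own write.
    if any(order.index(("read", i)) > order.index(("write", i)) for i in range(N)):
        continue
    x, tmp = 0, [0] * N
    for op, i in order:
        if op == "read":
            tmp[i] = x          # read the shared counter
        else:
            x = tmp[i] + 1      # write back the (possibly stale) value + 1
    finals.add(x)

print(finals)  # {1, 2, 3, 4} -- one trace with x != 4 refutes "always 4",
               # but showing inconsistency needs *two* traces with different answers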


What are Vector Embeddings?

In other words, when we represent real-world objects and concepts such as images, audio recordings, news articles, user profiles, weather patterns, and political views as vector embeddings, the semantic similarity of these objects and concepts can be quantified by how close they are to each other as points in vector spaces. Vector embedding representations are thus suitable for common machine learning tasks such as clustering, recommendation, and classification.
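
As a toy illustration (these are made-up four-dimensional vectors, not the output of any real embedding model), "how close they are" is typically measured with cosine similarity:

```python
import numpy as np

# Hypothetical embeddings: semantically related items point in similar directions.
embeddings = {
    "cat":          np.array([0.90, 0.10, 0.00, 0.20]),
    "kitten":       np.array([0.85, 0.15, 0.05, 0.25]),
    "stock market": np.array([0.00, 0.90, 0.80, 0.10]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))        # high, ~0.99
print(cosine_similarity(embeddings["cat"], embeddings["stock market"]))  # low, ~0.10
```

Clustering, recommendation, and classification all reduce to computations like this over the vectors.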


Share this with your friends ;)



Becoming a more self-directing Staff+ individual contributor

It’s easy to be led. When we first start out as developers we have little to no autonomy over our day-to-day: our manager hands us tasks, provides us with context, and adds us to meetings. We’re in the warm embrace of certainty. Once you become a high-level individual contributor (IC), being directed quickly nets you a calendar stuffed with recurring meetings, “just a quick question”, one-on-ones, and more, leaving little breathing room or time for proactivity.


The curse of being good in IT

Surprisingly, many who make complexity their bread and butter think that everybody doing the same job shares similar traits, that they are not special.

However, not everybody can go keyboard-only on a borderless tiling window manager, stay productive for months, and not burn out. Actually, most people can't.

Scratch that: most devs can't.


How does GPT-3 spend its 175B parameters?

So as I was reading up on transformers, I got fixated on this question: where are the 175 billion parameters in the architecture? Not in the literal sense (the parameters are in the computer), but how are they “spent” between various parts of the architecture - the attention heads vs feed-forward networks, for instance. And how can one calculate the number of parameters from the architecture’s “size hyperparameters” like dimensionality and number of layers?

The goal of this post is to answer those questions, and make sense of this nice table from the GPT-3 paper, deriving the n_params column from the other columns.
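
As a rough sanity check (a back-of-the-envelope sketch that ignores biases and layer-norm parameters), you can recover roughly 175B from GPT-3's published hyperparameters: 96 layers, d_model = 12288, a ~50k BPE vocabulary, and a 2048-token context:

```python
# Back-of-the-envelope parameter count for GPT-3 (ignores biases and layer norms).
n_layers, d_model, n_vocab, n_ctx = 96, 12288, 50257, 2048

embeddings  = n_vocab * d_model + n_ctx * d_model     # token + position embeddings
attention   = n_layers * 4 * d_model * d_model        # Q, K, V and output projections
feedforward = n_layers * 2 * d_model * (4 * d_model)  # two matrices with 4x expansion

total = embeddings + attention + feedforward
print(f"embeddings:  {embeddings / 1e9:6.2f} B")      # ~0.64 B
print(f"attention:   {attention / 1e9:6.2f} B")       # ~57.98 B
print(f"feedforward: {feedforward / 1e9:6.2f} B")     # ~115.96 B
print(f"total:       {total / 1e9:6.2f} B")           # ~174.6 B, close to the reported 175 B
```

Most of the budget goes to the feed-forward blocks, with attention taking about half as much and the embeddings barely registering.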


Book of the Week

Viva the Entrepreneur: Founding, Scaling, and Raising Venture Capital in Latin America


Do you have any more links our community should read? Feel free to post them in the comments.

Have a nice week. 😉


Have you read last week's post? Check the archive.
