November 24, 2021
One of the things that excites me the most in technology currently is biotech, and proteins in particular. Proteins are the machinery of the human body, and as we understand them better, a lot of cool things can be done.
This is to a large extent driven by progress in data analysis.
Here's how I think about it. Imagine aliens visiting Earth and trying to understand human society. After exploring for a while, they might eventually figure out that humans use computers for a lot of things (please disregard that for aliens to reach Earth, they'd probably need something similar to computers themselves). But instead of trying to boot one up and figure out the password, they'd look inside and take it apart. There, they'd see a collection of semiconductors holding ones and zeroes, stored in a non-random way. They might get that this is something used to store and transform information. The problem they'd face now is that they would want to interpret the information, and how we use it, from the ones and zeroes alone.
The way I see it, this is kind of the stage where we are with protein engineering. The difference is that we are not studying ones and zeroes, but rather sequences built from 20 different amino acids, which contain all the information necessary to account for protein function. However, it's hard to predict exactly which amino acid sequence maps to which function. If one could figure that out, one could engineer proteins with a very specific function.
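To make the ones-and-zeroes analogy a bit more concrete, here is a minimal sketch of how a protein sequence can be written down as digital information. It assumes the standard one-letter codes for the 20 amino acids. The peptide `toy` and the function names are made up for illustration and don't come from any particular library.

```python
import math

# The 20 standard amino acids, written with their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as one 20-element 0/1 row per residue."""
    rows = []
    for aa in sequence:
        row = [0] * len(AMINO_ACIDS)
        row[INDEX[aa]] = 1  # raises KeyError for a non-standard letter
        rows.append(row)
    return rows


def bits_per_residue():
    """Upper bound on the information one residue carries: log2(20)."""
    return math.log2(len(AMINO_ACIDS))


def sequence_space_size(length):
    """Number of distinct sequences of a given length: 20 ** length."""
    return len(AMINO_ACIDS) ** length


toy = "MKTAYIAK"  # hypothetical peptide, for illustration only
encoded = one_hot(toy)
print(len(encoded), "residues x", len(encoded[0]), "channels")
print(f"{bits_per_residue():.2f} bits per residue at most")
print("possible sequences of length 100:", sequence_space_size(100))
```

The last line shows why brute force is not an option: even for a short protein of 100 residues, the number of possible sequences is astronomically large, so the mapping from sequence to function has to be learned or reasoned about rather than enumerated.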
One way to solve this problem is with machine learning, which is what AlphaFold (by DeepMind) is doing for one key piece of it: predicting a protein's 3D structure from its amino acid sequence. More generally, the idea is to train models on amino acid sequences paired with their known structure or function, and from that predict the same for new sequences. Pretty exciting.