Organic chemistry is nothing but pattern recognition!?
“This is nothing but pattern recognition!” As unsettling as that might sound to many organic chemistry practitioners, it’s a comment one of my former supervisors made about (undergraduate) organic chemistry. Years later, this remark resurfaced as I dived into the world of machine learning.
I vividly remember a brilliant college classmate’s strong interest in neuroscience. Being more physics- and math-minded, I didn’t quite grasp it then. It wasn’t until recently that I realized how the processes of receiving information, forming memories, and making decisions could be modeled mathematically as communication between neurons in a network. (Huge thanks to Fei-Fei Li’s inspiring memoir, The Worlds I See, which I highly recommend!)
The basic unit (algorithm) in these artificial neural networks, aptly named a perceptron, takes numerical input signals and performs binary classification. It mimics what an actual neuron does: receiving (electrical) signals, accumulating and modulating them, and firing an output signal only when the total strength of the inputs exceeds a certain threshold. While complex cognitive activities involve large neural networks, at the most fundamental level, whether it’s a single neuron or a single perceptron, the primary task is classification.
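To make the "fire only above a threshold" idea concrete, here is a minimal sketch of a perceptron in plain Python. The function names (`fire`, `train_perceptron`), the learning rate, and the toy data set (a two-input AND pattern) are my own illustrative assumptions, not taken from any particular library or paper. The training loop uses the classic perceptron update rule, which is only guaranteed to settle when the two classes can be separated by a straight line.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (input signals, label 0 or 1)


def fire(inputs: Sequence[float], weights: Sequence[float], threshold: float) -> int:
    """Return 1 (fire) only if the weighted sum of inputs exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def train_perceptron(
    samples: List[Sample], epochs: int = 20, learning_rate: float = 1.0
) -> Tuple[List[float], float]:
    """Learn weights and a threshold with the perceptron update rule."""
    weights = [0.0] * len(samples[0][0])
    threshold = 0.0
    for _ in range(epochs):
        for inputs, label in samples:
            # error is +1 (should have fired), -1 (should not have), or 0 (correct)
            error = label - fire(inputs, weights, threshold)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            threshold -= learning_rate * error
    return weights, threshold


# Hypothetical toy data: the unit should fire only when both inputs are "on".
toy_samples: List[Sample] = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, threshold = train_perceptron(toy_samples)
predictions = [fire(inputs, weights, threshold) for inputs, _ in toy_samples]
print(predictions)
```

After training, the learned weights and threshold draw a single dividing line between the two classes, which is all a lone perceptron can do. Anything more nuanced requires stacking many such units into a network.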
Classification is indeed how we learn about the world. As babies, we learn to identify a cat versus a toy, or what’s edible versus inedible. As we grow older, we categorize words into nouns and verbs and recognize facial and vocal expressions that signify emotions like happiness or anger. We make sense of the world by organizing and classifying information, which is essential for understanding, decision-making, and communication.
Humans are amazingly good at classification. It allows us to identify risks in a split second and avoid accidents; it even enables us to sense our surroundings without direct attention. However, this ability is so deeply ingrained that it can lead to unconscious biases, such as singling out people who look different from us, which keeps us from embracing diversity.
Returning to chemistry: once we recognize that our learning and cognitive processes begin with classification, the idea that “organic chemistry is nothing but pattern recognition” seems less outrageous. However, going from classification to true pattern recognition involves additional tasks, such as feature identification and extraction, and regression. In chemistry, these might include evaluating electrophile strength, steric crowding at the reaction site, temperature, or the polarity of the environment. Admittedly, these subtleties may not be fully grasped even by most seasoned chemists, but they can be learned effectively by machines given suitable data sets, as evidenced by the growing number of publications on the application of artificial intelligence in synthetic organic chemistry.
To conclude, let me quote Flynn and Ogilvie from their article in the Journal of Chemical Education: “Chemical reactions follow patterns, and these patterns can allow a chemist to predict how a chemical will behave, even if they have never seen a particular reaction before.” Since we’re not ditching learning, teaching, or using electron-pushing formalism in our minds anytime soon, we might improve how we think about and practice organic chemistry by comparing our own approach with the methods that make machines “smarter” at synthesis.