Fast forward two years and a bit, and the lab has published the molecule's synthesis (Download paper here). That means it's my turn to write another blog post, on topics (at least vaguely) related to that paper.
A summary of the paper if you feel the manuscript is tl;dr, or you're behind a paywall:
- Tryptorubin A is a neat indole alkaloid. It's isolated from an interesting ternary symbiosis: Leaf-cutter ants pile up leaves to compost into mulch. Then, the mulch they have generated goes into little mushroom farms, where the ants grow mushrooms to eat. This is their main diet. Meanwhile, on top of these farmed mushrooms live a variety of defense bacteria. One of those bacteria, a Streptomyces sp., generates tryptorubin A as a metabolite. From these bacteria Clardy et al. isolated the molecule and published its fascinating structure (link to isolation).
- Actually, none of this is in the paper but it's just cool trivia about the molecule.
- Although nothing is known about the bioactivity or biological role of tryptorubin A, we pursued its synthesis strictly for reasons of structural interest. This is a somewhat old-school, very academic way of thinking. The project was incepted as follows:
- <scrolling through JACS ASAPS>, "Hey, that's an interesting structure. I bet it would be hard to make."
- Me, to Phil: "This molecule would be hard to make."
- Phil, to me: "Great. The best projects are ones where people are slightly scared at the start. Have fun."*
- Me: "Damnit, now I have to make it."
- Off I set to make it. The chemistry is pretty self-explanatory (tons of peptide couplings, an Ullmann and a Friedel-Crafts), so I won't go into much detail there. However, I will note that in the SI, we briefly highlight a dozen or so failed strategies (SI pages 8-10), if you're curious to see some early things we tried.
- The extremely short summary of chemical lessons learned is (a) don't try to close highly strained macrocycles with macrolactamizations, if possible, and (b) KISS. Fancy ring contraction or other trickery never beat out simple, straightforward old school robust chemistry.
- I made what I thought was the natural product <9 months into the project. This is where stuff got interesting.
- None of the NMR data matched the real natural product.
- However, the 2D data unambiguously confirmed that we had the correct atomic connectivity and stereochemistry.
- What the hell was going on?
- At this point, we reached out to team Clardy and got all the original NMR data. I spent about 2 weeks simply staring at NMRs trying to understand what could possibly be the difference between what I had made and the natural product.
- This was, by far, both the most challenging and the most rewarding couple weeks in my graduate career.
- I don't have many photos relating to this project, but I have this one, which summarizes the era nicely. This was taken at about 4AM. Coffee in one hand, beer (unopened?) in the other. Cumulative sleep in the preceding 72 hours? Approximately 6 hours total. Structural assignment is great. And also terrible.
- Finally, we hit the conclusion. Both natural tryptorubin A (compound 1a in the published manuscript) and what we had made (compound 1b) had identical connectivity and point chirality, but were inside-out with respect to one another. Formally, this renders them 'non-canonical atropisomers'**
- We spent a lot of time thinking about how to communicate this isomerism most clearly, and our best efforts are represented with cartoon drawings in the paper, reproduced here:
- We of course still had to make the natural product. Luckily, a crystal structure (compound 7) in the paper gave us a big hint: The indoline's point chirality enforced a pro-atropisomeric conformation that only allowed the bicycle to close in one way. We're not the first to use point chirality to relay to atrop-control, but to our knowledge this is the first control of such a "right-side-in vs. inside-out" molecular shape in a natural product.
- We made the molecule, and everyone was happy. Well, I would be happy, except I haven't gotten my fricking molecule cake. Phil, I know you're reading this. Everyone who finishes a molecule gets a cake. Where's my cake? Cake please.
- Given the interesting structural nuances of this story, we desperately attempted to get solid-state structures of the respective isomers. I launched a collaboration with Hosea Nelson @ UCLA to try to do MicroED, and also with James Nowick @ UCI to try high-throughput crystallography screening, but unfortunately, never got any structures. So, we're still reliant on NMR for all conclusions.
- In parallel with finishing the synthesis, we were also lucky to tie in with the Clardy group, who were further elucidating the biosynthesis of this molecule. That's an interesting story in itself, but not mine to tell -- the extremely short version is this molecule is a RiPP, not an NRPS as originally believed, and there is some very cool CYP machinery to close the necessary macrocycles. More to come on this side of the story, I hope.
Fundamentally, this is a synthesis paper that's not about synthesis -- it's about understanding molecular conformation and supramolecular isomerism, and about how we can control said isomerism in a flask. We hope it's an enjoyable read.
I want to close out by thanking the incredible team I was lucky to work with throughout this process, most notably Yang Gao, who is an all around killer chemist. Thanks also to Jon Clardy and his talented postdocs Allison, Eric, and Emily for being a pleasure to collaborate with. And Phil, of course, is an amazing boss to work with. Even when he forgets to buy cake.
*TANGENTIAL SIDEBAR #1: "Why the heck do we still do total synthesis in the modern era?"
I want to start this sidebar with the statement that opinions here (and throughout this blog post) are strictly my own and may not represent Phil or the lab's overall perspective. For Phil's opinion on the utility of total synthesis in a modern context, it's worth reading his recent perspective.
With that note, here is my hot take on total synthesis in the abstract. My viewpoint is that the goal of a total synthesis must be considered with intentionality, and that the total synthesis, once completed, should be scored against that intentionally-designed goal. In my view, any of the following are perfectly reasonable drivers of a synthesis:
- Scalable access to material is needed
- SAR of unnatural analogs needs to be probed
- The robustness of a method for a key step is being tested, and the synthetic target is simply an arbitrary vehicle on which to test a new methodology
- Drivers for new methods are desired, and the total synthesis is being pursued to illuminate "which bonds are difficult to make with existing methods" with the explicit plan of spinning out new methods projects
- Blind faith that with structural complexity comes chemical insight, and that some serendipitous discovery occurs
- Sheer aesthetic work (directly analogous to a concert violinist's profession; creating beauty for the sake of creating beauty)
I feel that synthesis done under any of these umbrellas is useful. The problems start to arise when the declared goal is not actually enabled in the synthesis. It drives me up a wall when people introduce synthesis as 'a tool to make scalable amounts of material' and then make <1 mg. Because, at that point, you're not actually progressing solutions to the problem you're trying to solve. Likewise, if someone has a declared goal of exploring SAR, and subsequently makes a single analog, it amounts to snake-oil salesmanship. These kinds of farces are where the field risks stagnation.
So, I encourage everyone considering a synthesis to spend some deep introspection asking, "Why should I make this molecule," and subsequently throughout the project enforcing that all strategic decisions are well-aligned with the answer to that 'why.'
One issue that is worth discussing is the complexity around naming the isomeric relationship between 1a and 1b. The easy layman's explanation that 'one is inside-out relative to the other' is somewhat difficult to translate into formal language. Originally (i.e., before the peer review process), we referred to these compounds as topoisomers. If you poke around for the formal definition of topoisomer, the two definitions that pop up are "two compounds with identical connectivity but non-identical molecular graphs" and "two compounds with identical connectivity and point chirality, but that can only be interconverted via scission and reformation of bonds." Now, according to the former definition, 1a and 1b are definitely not topoisomers. According to the latter definition, things begin to get a little hairy, because if one ignores the physical limitations of bond lengths, 1a and 1b can interconvert (see Figure 4), but it's a very unphysical transformation.
However, a very thoughtful reviewer (thanks!) was kind enough to point out that physicality should be ignored in the topoisomer definition, and thus we can't think of 1a and 1b as topoisomers. For this reason, they should actually be called atropisomers. This risks swinging too far in the other direction, though: 1a and 1b's isomerism ('inside out') is categorically different from a prototypical atropisomer (e.g., binaphthyl) -- tryptorubin's isomerism has way more going on than simple sigma bond torsion.
To give a sense of just how confusing all this nomenclature can be, we decided to run a small social experiment. Phil tweeted a poll about lasso peptide isomerism:
Now, before we view the results of this poll, note that lasso peptides are just like tryptorubin: They are formally topologically trivial, but have a very defined set of shape-based isomers.
Go to twitter to read some of the fascinating commentary, but the short version is this: Even Twitter experts (people with the doctoral degree TweeHD?) have diverse and contradictory views on what this isomerism should be called.
Combining all of this, we coined the term 'non-canonical atropisomerism' to encapsulate tryptorubin's (and, for that matter, lasso peptides') shape-based isomerism. For me, the upshot of the entire nomenclature story is a classic 'a rose by any other name would smell as sweet' -- what we call the isomerism basically doesn't matter; the important part is that it's communicated clearly, and hopefully the manuscript (and especially structural graphics) make everything comprehensible.
Very cool. Cake well deserved, Sol et al!
How about calling them conformational atropisomers (as opposed to rotational atropisomers)?
This story raises several interesting questions:
1. How many other compounds have undiscovered conformational atropisomerism?
2. Are there any cases where a structure has been believed to be incorrect because the synthetic product had different properties but is actually a different atropisomer?
3. Are there any other undiscovered forms of isomerism?
Hi,
We thought about something along the lines of 'conformational atropisomers' but ultimately decided against a term like that because, to our ear, that didn't sufficiently capture the impossibility of interconversion. Conformers, by definition, interconvert at some rate, and an important part of the compound 1a/1b pair is that they could never, within the laws of physics, possibly interconvert. Same with lasso peptides.
To your questions:
1. We agree that there are very likely other compounds with this type of isomerism. We haven't found any yet, but probably a good starting point is this review: https://pubs.rsc.org/en/content/articlelanding/2013/np/c2np20085f
We'll certainly keep our eyes peeled for this isomerism in the future (and hope isolationists will, too, in their assignments!)
2. To our knowledge this hasn't happened, but it certainly could. One interesting thing to think about on this topic is the converse: If we had (by coincidence) from the beginning pursued a route that happened to make 1a (i.e., done late-stage indoline oxidation), we would have completely missed the isomerism present in tryptorubin. It's only because we accidentally made the wrong one that we discovered the isomerism. Perhaps other synthetic compounds have been made, where this type of isomerism is at work, but it was totally missed because the 'correct' one was made first.
3. I think there likely are, but probably most are in larger systems. In macromolecules, there are more degrees of freedom that may lead to more types of isomerism. For one example, see the fascinating isomerism seen in: 10.1038/s41557-018-0043-6
Best,
Sol
I amend my suggestion to configurational atropisomers.
Note that the examples in the paper for a catenane & rotaxane don't actually show isomerism since if separated they are no longer a single molecule & therefore no longer have the same formula. To show true atropisomerism the catenane would need a bigger ring that the other part could pass thru either once or twice & the rotaxane would need a longer core that the ring could have at least 2 positions on (both of which might create really weird chirality or isomerism). Some other odd possibilities that occurred to me are linked helicenes (topologically it definitely isn't a catenane but it sure looks like & should behave like one), 3-bladed rotational atropisomerism (i.e., attach 3 benzene rings to bicyclooctane; link it to anthracene & you get a strangely strained structure), and standard rotational atropisomerism but with a B-N bond instead of C-C between the parts (which might interconvert by dissociation instead of rotation - by the way, if the 2 central carbons in anthracene were replaced with B & N would the result act like molecular Lego?). Any idea if anyone has tried these? Seems like I could go on forever...
On further thought configurational atropisomers might be better suited for all topologically identical atropisomers. Third try: evertional atropisomer.
A few other comments about the paper:
In Fig. 3B shouldn't the line go from H to Ala instead of Tyr?
Fig. 4 does an excellent job of showing the relationship between the 2 isomers. It looks like they could also (theoretically) interconvert by rotating the bonds in the opposite direction so the Ile4-Trp5-Tyr6 portion passes outside rather than inside the ring, which offhand seems less impossible.
Close inspection of the unannotated region revealed a ribosomal binding site followed by a transcriptional start site - the ribosomal binding site must be in the transcript so it needs to be after the transcriptional start site (and before the translational start site).
Because we were not able to knock out the putative trp BGC - trp is universally used in bacterial genetics for the tryptophan biosynthesis genes; it would be really good to use a different symbol.
Hi,
All of these alternate names for the isomerism are totally reasonable -- in some sense, as I talk about above in the blog, the name of the isomerism is fundamentally arbitrary as long as it is differentiated. We happened to pick non-canonical, so will stick with that. (And try to clarify as much as possible via figures)
As to your other comments:
With respect to fig3B and the biosynthesis study comments, I've forwarded your questions to our biosynthetic collaborators and so hopefully they'll comment shortly.
With respect to Figure 4, you're right about this alternate theoretical interconversion -- I would describe what you're saying as Tyr3 passing inside the macrocycle as Ile4/Trp5 swings past it. However, if you build a plastic model, it's instantly apparent that this interconversion is even crazier than the one currently drawn in Fig. 4. Remember, the red line in Figure 4 between Tyr3/Ile4 represents a single amide bond -- it is drawn disproportionately long in this figure for clarity. So, to do the interconversion you're talking about, you're essentially asking an amide to stretch long enough to bend around the entire length of the Tyr's benzylic methylene+arene.
The conclusion from that thought experiment is identical to the current Fig. 4's conclusion -- from a math topology perspective, the compounds are identical (topologically trivial); however, interconversion is crazy to even consider with the laws of physics.
Hope that clarifies,
Sol
Hi,
With respect to Figure 3B, you're right in the most literal sense -- the leader's terminal histidine attaches into the precursor peptide's alanine. The graphic in the bottom of Fig. 3B is supposed to represent in a 'cartoon fashion' that the leader and tryptorubin precursor are attached, rather than signifying the actual point-of-attachment. For the actual primary-structure-level order of attachment, see the one-letter codes in the top of Fig. 3B, as well as more info in the paper's SI.
With respect to the order of the genetic elements, see figure S17 in the SI -- this should clarify things, with its schematic overview of the genetic elements.
With respect to naming the biosynthetic gene cluster 'trp', we didn't expect confusion with the tryptophan operon because the context of use is so different. Secondary metabolite biosynthetic gene clusters are normally abbreviated in a three letter code that alludes to the name of the corresponding metabolite. We are sorry for any confusion that resulted from our naming, and will certainly consider this in future publications.
Cheers,
Eric
Good job! I guess it took a couple thousand experiments to prepare both atropisomers for comparison. It can be quite stressful to manage this large amount of information, especially in a project that spans several years. Do you use traditional paper lab notebooks? Or have you switched to some electronic system?