Tessl

Deep Dive on AI Documentation: Live Demo with Omer Rosenbaum

In this episode, Simon Maple sits down with Omer Rosenbaum, CTO and founder of Swimm, to explore how AI can transform the way developers handle code documentation. Learn about the challenges developers face, how Swimm's innovative AI tools address these issues, and see live demonstrations of Swimm in action.

Episode Description

Join Simon Maple as he interviews Omer Rosenbaum, the CTO and founder of Swimm, in a fascinating discussion on leveraging AI to tackle the perennial challenges of code documentation. Omer shares insights into how Swimm uses static analysis and generative AI to keep documentation in sync with code changes, making it easier for developers to find and maintain relevant documentation. Through practical demonstrations, Omer illustrates how Swimm can be integrated into your development workflow to enhance onboarding, training, and overall code comprehension. Whether you're a developer or a product manager, this episode provides valuable insights into the future of code documentation.

Chapters

  1. [00:00:20] Introduction - Simon Maple introduces the episode and guest Omer Rosenbaum, discussing the importance of AI in documentation.
  2. [00:01:42] What is Swimm? - Omer explains Swimm's capabilities and how it helps organizations understand their code.
  3. [00:02:06] Practical Demonstration Using Swimm - Omer walks through a live example of adding a new command to the Git CLI using Swimm.
  4. [00:03:16] Discovering Documentation Without Searching - Demonstration of Swimm's feature that presents relevant documentation contextually within the code.
  5. [00:07:50] Auto-Syncing Documentation with Code Changes - How Swimm automatically updates documentation to reflect changes in the codebase.
  6. [00:08:39] Integration with GitHub and CI/CD Pipelines - Discussion on Swimm's seamless integration with GitHub and CI/CD pipelines.
  7. [00:13:52] Guided Approach to Documentation - Omer shows how developers familiar with the code can use AI to create documentation efficiently.
  8. [00:16:23] Automatic Documentation for Unknown Codebases - Demonstration of Swimm generating documentation for a large COBOL repository using AI.
  9. [00:25:26] Balancing AI-Generated Documentation with Human Validation - The importance of human validation to ensure accuracy in AI-generated documentation.
  10. [00:31:51] How to Get Started with Swimm - Omer explains how listeners can try Swimm and take advantage of its free tier and enterprise solutions.

Full Script

[00:00:20] Simon Maple: Hello again, and welcome back to part two of this episode. Joining me is Omer Rosenbaum, who is the CTO and founder of Swimm. And Omer, we just had a wonderful discussion, really about how AI can make a really strong impact on a really important problem around documentation, in a number of different ways.

[00:00:45] Simon Maple: First of all, how the awareness of the fact that documentation even exists or how it's connected and relevant to code a developer is intending to change or wanting to learn from, that awareness is absolutely core. And then secondly, making sure that changes to that code are kept in sync, so that there is no drift,

[00:01:07] Simon Maple: between code and documentation. A number of other topics that we talked about, but for me, those are some of the core areas that I guess pain a developer so much today. And we're gonna walk through that a little bit in this session. So we're gonna jump into some screen shares and you can show us that.

[00:01:26] Simon Maple: Now, Omer, you're obviously gonna be showing this using Swimm as the example, but we're really gonna be talking through a lot of the problems and how that can really be helped with AI. Give us a brief intro into Swimm for those who perhaps missed the first episode or maybe need a refresher,

[00:01:40] Simon Maple: tell us a little bit about Swimm.

[00:01:42] Omer Rosenbaum: Sure. So at Swimm, we help organizations understand their own code, by relying on static analysis capabilities, as well as generative AI, we help with generating documentation automatically, finding it when you need it most, and keeping it up to date as the code changes to avoid drift.

[00:01:59] Simon Maple: Amazing. And let's jump straight into the screen share. In fact, I'll bring your screen up here.

[00:02:06] Omer Rosenbaum: Sure. I would actually like to start from the end, from when you actually do have documentation and get us all into the mind of a developer who is trying to introduce a change to a code base that they don't know.

[00:02:18] Omer Rosenbaum: And specifically I chose the code base of Git. I really like it. I also wrote a book about Git. So I know some of the code, but, we'll talk as if someone were to introduce their first pull request to git source code and specifically to add a new command to git CLI. So like you have git add and git commit and so on.

[00:02:37] Omer Rosenbaum: So maybe you want to add a git something else. And you're looking for

[00:02:42] Simon Maple: And Omer, I'll be honest. I haven't done C coding for 20 years since I was at university. So I really need some documentation to understand this.

[00:02:51] Omer Rosenbaum: Fair enough. Most people need some help to understand new code they see, right?

[00:02:56] Omer Rosenbaum: We tend to forget it when we get acquainted with the code base, how daunting it might seem at first glance. So here, if we consider my workflow as a developer, I don't think about documentation. I read the code first, right? And I did some searching and I found that there is this file, git.c which parses some things.

[00:03:16] Omer Rosenbaum: I don't really understand it all the way. And at some point, I get to this part with a struct called cmd_struct and I see the wave icon here. Now this wave icon tells me that someone has written a document, this someone could be a human or AI. Someone has written a document and mentioned this part of the code.

[00:03:37] Omer Rosenbaum: And specifically, if I hover here, I can see that it says that to make it aware of the command, it needs to be registered by adding it. And I see here that it's part of a document titled adding a command. If I click on it, it would open the document on the right hand side. Now, the important thing to notice here is that I found the document without searching for it, right?

[00:03:58] Omer Rosenbaum: I was browsing through the code, and then I saw, hey, someone documented this. When I clicked, it opened up the document. I'll zoom out a bit so we can easily see, but this snippet here is actually those lines from the code. So let me explain a bit what Swimm documents look like. Swimm documents consist of text and also three main elements that are coupled to the codebase.

[00:04:23] Omer Rosenbaum: One is a snippet. So you can see this file, sorry, this document walks you through multiple files. It talks about how you add a command and it says that when you add a new command, you have to create a new file under the builtin folder and you need to update builtin.h and also git.c. If you click on one of those snippets, it would also take you to the right part in the code so you can see what I'm talking about here.

[00:04:49] Omer Rosenbaum: So a snippet is a few lines of code from somewhere within the repository. Another element is a token. A token is a part of a line. For example, add here is not just the string add, it's the token add from line 485, this one here. Third is a path, like a name of a file.

[00:05:09] Omer Rosenbaum: It could also be a name of a folder. Now by tracking and coupling, we achieve a few things. One is that as a reader of the documentation, if I read this and I try to understand what you are talking about when you say add, and I click, it tells me where it is. If I click here, it will take me to the right place in the code,

[00:05:25] Omer Rosenbaum: so it's easier to write or understand what this is about. Another big advantage we get is that we can keep this up to date because we know what part of the code we've documented here. So for example, say now I'm a developer and I'm changing something here and maybe I changed the name of this struct.

[00:05:44] Omer Rosenbaum: Maybe I added a comment here so the line number would change. So some comment. And maybe I changed here from commands to my_commands. And I click on save here. If I go back here to the document within Swimm, I would see that it shows me that the code changed, and it showed me the before and after. So it'd say, hey, look, what used to be commands in line 484 is now my_commands on line 486.

[00:06:14] Omer Rosenbaum: And also the snippet changed, right, commands turned to my_commands. So this changed. And these kinds of changes we call auto-syncable, that is, we automatically sync them with the code, and you can configure Swimm to automatically click on accept here and accept this as the source of truth, because it's a subtle change.

[00:06:34] Omer Rosenbaum: We might find this change and you would want to add some information. I add another flag, say the Tessl flag here. And now you might want to explain, that this flag is useful for some cases where you want to add it. So I show you that this is the part that actually changed. And now you might want to add this to the documentation and say, okay, notice the use of Tessl flag here and explain why we do that.

[00:07:05] Omer Rosenbaum: Notice that this is not linked to code, but you see this dashed underline that tells me, Oh, you probably wanted Tessl flag from this line, right? Yes. Now, it's coupled to the code base, from now on, I would know that this is the Tessl flag I'm talking about, and of course, I'll make sure it remains up to date as the code evolves.
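The line-shift case described above (a comment is added, so the documented token moves down) can be sketched in a few lines of Python. This only illustrates the general idea of re-anchoring a coupled token and is not Swimm's actual algorithm. `TokenRef`, `relocate`, and the `window` parameter are hypothetical names, the file contents are made up, and the rename case (commands to my_commands) is deliberately left out.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TokenRef:
    """A documented token: which file, which line, which word (hypothetical model)."""
    path: str
    line: int   # 1-based line number recorded when the doc was written
    token: str


def relocate(ref: TokenRef, new_lines: List[str], window: int = 5) -> Optional[TokenRef]:
    """Search outward from the recorded line for the token as a whole word.

    Returns an updated reference, or None when the token is not found within
    `window` lines, which a real tool would flag for human attention.
    """
    pattern = re.compile(rf"\b{re.escape(ref.token)}\b")
    for offset in range(window + 1):
        for candidate in (ref.line - offset, ref.line + offset):
            if 1 <= candidate <= len(new_lines) and pattern.search(new_lines[candidate - 1]):
                return TokenRef(ref.path, candidate, ref.token)
    return None


# Hypothetical file contents after a comment was inserted above the documented line.
after = [
    "static struct cmd_struct commands[] = {",
    "\t/* some comment */",
    '\t{ "add", cmd_add, RUN_SETUP },',
    "};",
]
moved = relocate(TokenRef("git.c", 2, "add"), after)
print(moved)
```

Matching on a whole word rather than a substring keeps `add` from being confused with `cmd_add` on the same line, which is one reason a token is tracked as more than "just the string add".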

[00:07:28] Omer Rosenbaum: When you change the code, there are two ways to know it affects documentation. You can either add a git hook that would tell you something is outdated, and then within the Swimm plugin, you can see which doc is outdated. You can also see from the icon as before that here, there is some document.

[00:07:50] Omer Rosenbaum: It doesn't appear now because it's outdated, but before that you see that there is an icon telling you something. There is some document that corresponds to this code. So if you change it, you don't need to check it. And as we mentioned before, we also integrate with the CI and the PR process. So if we look at the same code base

[00:08:08] Omer Rosenbaum: in GitHub now, and say I introduced a change within GitHub, this change made one document outdated. So it says there are docs that need your attention, this Swimm doc. And there is also a check that fails because not all documents are up to date. If I click here, I go to Swimm's web application, which looks similar to what we saw in the IDE, but it's from the web application, and I see here the elements that need my attention.

[00:08:39] Omer Rosenbaum: So one is this. So you see here what used to be git.c is now main.c, so in this pull request I actually renamed the file, and I also changed it from commands to subcommands, and I could accept. So it's a very similar experience to what you have in the IDE, but you get it from the web application as part of your pull request process.

[00:09:01] Omer Rosenbaum: And now, if you want to add something, as explained before, you can do it here, it's also editable. Click accept. And at the end, you would commit it to the code base. All of the documents are Markdown files that are stored in the code base, by the way, alongside the code.
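The failing check in the pull request can be pictured with a small sketch. This is an assumption-laden illustration of a "docs are up to date" gate, not Swimm's implementation: the `Snippet` record, `find_outdated`, and `exit_code` are hypothetical, and the repository is modeled as an in-memory dictionary so the example stays self-contained.

```python
from typing import Dict, List, Tuple

# (path, 1-based line, expected text of that line): a hypothetical snippet record
Snippet = Tuple[str, int, str]


def find_outdated(docs: Dict[str, List[Snippet]], files: Dict[str, List[str]]) -> List[str]:
    """Return the names of docs with at least one snippet that no longer matches the code."""
    outdated = []
    for doc_name, snippets in docs.items():
        for path, line, expected in snippets:
            lines = files.get(path)
            missing = lines is None or line > len(lines)
            if missing or lines[line - 1].strip() != expected.strip():
                outdated.append(doc_name)
                break  # one stale snippet is enough to flag the doc
    return outdated


def exit_code(docs: Dict[str, List[Snippet]], files: Dict[str, List[str]]) -> int:
    """A CI job or git hook would fail (non-zero) when any doc is outdated."""
    return 1 if find_outdated(docs, files) else 0


docs = {"adding-a-command.md": [("git.c", 1, "static struct cmd_struct commands[] = {")]}
before = {"git.c": ["static struct cmd_struct commands[] = {"]}
# The pull request renames git.c to main.c and commands to subcommands.
after = {"main.c": ["static struct cmd_struct subcommands[] = {"]}

print(exit_code(docs, before))
print(find_outdated(docs, after))
```

A strict text comparison like this would flag every change. The auto-syncable changes described in the transcript are the ones a tool can resolve on its own, so a real check would only fail on what is left after that step.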

[00:09:19] Simon Maple: So if we go back to the IDE just for a second, I think there's a few workflows here that are really interesting.

[00:09:25] Simon Maple: So the first one that kind of struck me is, before we made changes there and the icon was there. It really allows me as a, there it is, that kind of an icon. For a developer who is looking at this code base from scratch, it's really nice to be able to get that information to say, here is something that can actually help you, not just understand the code, but if you were thinking about the workflows and how to change that code, it gives you that greater understanding into that.

[00:09:55] Simon Maple: But I think if a developer is coming in here with the intention to make changes to code, it's super, super important that they recognize these as areas, points at which there could be drift. And it highlights the fact that, okay, not only should you potentially read this if you are unfamiliar with this code, so that you know how to better make that change, but to safeguard future developers and users of this code, make sure you document any code changes so that they see that up-to-date version of this documentation.

[00:10:32] Simon Maple: I think that first piece is super, super valuable, which avoids that drift, like we mentioned in that first episode. So now I can see why, as you said, most Swimm users prefer to do it in the IDE, because they're almost validating the changes that are occurring, and I think it's that much nicer to be able to do it as you go and accept in the IDE.

[00:10:55] Simon Maple: I think you mentioned that's the way that most of your users use this documentation.

[00:11:00] Omer Rosenbaum: Correct. Most of them. And I think also when they find documentation that helps them when they need it most, that's also like a wow effect, right? And the extreme cases, like someone wrote here why they made the change and then you read it and you're like, Oh my God, I was supposed, I was just going to change this.

[00:11:16] Omer Rosenbaum: And apparently, it had been done like so in the past, but someone fixed it and he explains here why it's not a good idea. And it's saving me so much time, right? And then you got a wow effect that you only get if you find documents when you need them, right? No, absolutely. Just to add to what you said before, also the fact that we can put this icon here stems from the fact that we keep this link up to date.

[00:11:41] Omer Rosenbaum: Otherwise you would see an icon on the wrong snippet or on something that is outdated, right? And then it's more harmful than that.

[00:11:48] Simon Maple: Yeah. Yeah. So how do you connect this version of the code to the version of the documentation in Swimm? Do you have a similar kind of Git-style connection?

[00:12:03] Simon Maple: So if I was perhaps to get a different branch or a different version of the same code base, how do you keep that aligned?

[00:12:11] Omer Rosenbaum: So since the Swimm documents are Markdown files stored within the repository in a folder called .swm, they are also version controlled, like the rest of the code.

[00:12:24] Omer Rosenbaum: So if I change this, I commit, and now when you check out this commit or branch, you will see the corresponding version of the code in the document.

[00:12:33] Simon Maple: Cool. And one of the, one of the things that when you go across into the, onto the plugin on the right hand side, I guess there are two pieces here, which is very interesting.

[00:12:41] Simon Maple: The first piece, which is a suggestion of what should be changed. And then the second piece, which is I want to add some text myself, to guide effectively, and provide a snippet of code with a snippet of my thoughts to generate that information. I guess the first question is, without the guidance, how good is AI? And I guess you use an amount of static analysis on the code as well to understand the flows. How good are the suggestions without any information, whereby if we were only to rely on that, would it be right, like, 70 percent of the time, 80%?

[00:13:19] Omer Rosenbaum: So without any additional context that you provide to AI, it would be 80 percent not what you want.

[00:13:26] Simon Maple: Oh, really? Wow. Okay.

[00:13:29] Omer Rosenbaum: And only when you provide additional context is it able to produce something meaningful. Otherwise, it could make a lot of guesses that are uneducated because, we didn't provide it with the right context. When we provide the right context, it does a pretty good job on it. I think 80 percent of the time is fair.

[00:13:43] Omer Rosenbaum: Yeah. But I think this also begs the question of how you get to write these documents in the first place, right? And where I think AI also shines. So I'll show that in a moment too.

[00:13:52] Simon Maple: Yeah. And that shows how much of the context is still in the author's or the developer's head when they're making those changes to actually get it accurate.

[00:14:00] Simon Maple: That's super interesting. So it's still very much an assistant-style workflow. Yeah. And the value for the PR, as we mentioned in the previous session as well, is that, of course, you can have many contributors to a project, and you can't force a developer to use, in this case, Swimm in their IDE.

[00:14:20] Simon Maple: They might just go ahead and make changes and push some commits up and send a pull request. So it allows that policy effect across your entire project to make sure it isn't just two of your 10 developers that are writing code with documentation changes.

[00:14:35] Simon Maple: And actually 80 percent of the team are allowing the documentation to go stale. It allows you to enforce the parity between your documentation and your code. Definitely. Yeah, okay, cool. Interestingly, we looked at the Swimm documentation here.

[00:14:54] Simon Maple: I'm curious, how has that evolved over time? Has it always looked like this, or were there various stages of learning in terms of how you got to this?

[00:15:03] Omer Rosenbaum: There were many stages of learning. It didn't even start as a Markdown file. It started as a JSON file that was hard to read.

[00:15:12] Omer Rosenbaum: And it evolved, and was refactored many times so that the markdown can actually also be rendered without using Swimm. You don't get this experience of clicking and getting to the right file or all of the niceties that you get from this editor. But everything is rendered just fine on say VS Code preview or GitHub's preview.

[00:15:31] Simon Maple: Yeah. Okay, cool. And if you had existing documentation, how would you connect that? How would you couple that to the code?

[00:15:37] Omer Rosenbaum: So you could import it into Swimm, and since it's markdown, it actually works out of the box. But then you would go over it and add references to the code. If you remember from before, when we had some things that are not coupled, like here, this T here, it tells me, Hey, did you actually mean this T maybe here?

[00:15:54] Omer Rosenbaum: So this might be a silly example with just a letter, right? But if it were a name of a file or a variable from the code or something like that, we would say you probably want to couple it here. So we'd make those suggestions for you.

[00:16:07] Omer Rosenbaum: So I think now it would be interesting to see how you go about creating these kinds of documents. Yeah. And I want to show two almost extreme cases. One is where I'm going to guide AI into writing a document in the way that I want it to be. And the other would be, we'll look at a code base that I don't know.

[00:16:23] Omer Rosenbaum: And at least for the sake of the story, no one knows. It's a COBOL repo. It's a very big COBOL repository that's open source. And we'll see how we generate documentation for this automatically. Starting from a more guided approach, let's say I want to create the same document as I did here, and I would start by going, let's even get it back to the original state.

[00:16:47] Omer Rosenbaum: I'll start by creating a document. Now when you think about creating a document, most people envision like a blank page, something like this, right? And let's write something, which is a very daunting state for most people. I don't like a blank screen. Most people don't. So instead, what I want to do is to focus on the code.

[00:17:07] Omer Rosenbaum: The code which, in this case, I know because I want to document it, and I want to show how I create a new git command. So I just say, okay, if I were to explain to Simon how I add a new git command, I would start by showing him this struct, so I'll select it, right click, and Swimm, Add to New Doc. This adds it to a new document that would be created here.

[00:17:27] Simon Maple: And this creates that connection as well, so this is coupling the code to that doc immediately.

[00:17:35] Omer Rosenbaum: Correct, and now you would also see this icon here once I write something in here. So they're done. Then, I can go to definition and do anything that I would like. Let's say I want to go to builtin.h. Let's say I'll explain about, annotate this time, another command, so add to current doc, it would add this snippet. Now I'd also go to where it's implemented, so I'd go to annotate.c, and I'll add that, add to current doc. Cool, so now I have three snippets, and let's say that's the code flow that I want to say, or in this case the recurring pattern, and I want to say generate draft with AI.

[00:18:12] Omer Rosenbaum: So I'm going to give it a title, how to add a new command in git, and here I can add instructions if I want to. In this case, I want to say, explain how to add a new command using annotate as an example. And structure, let's say, introduction to process adding a new command.

[00:18:45] Omer Rosenbaum: What I want to get now, and as you'll see in a moment, is an educated version of a document. That did most of the heavy lifting for me in terms of actually, writing the text, explaining what it does, and also coupling it to code. So you see, for example, here it says declare the command function.

[00:19:03] Omer Rosenbaum: Next we'll declare the command function builtin.h. It includes this, element, this coupling to the code, implement the command logic, and so on. And it needs to be according to my instructions, right? So if I don't like it, I could revert and edit prompt. Now there might be something that I want to add that notice that maybe in this version, we do things differently because such and such, everything here is editable, right?

[00:19:28] Omer Rosenbaum: I could go here now and edit it as I like. I can also select this and tell AI, Hey, make it shorter, make it better, make it different, so I can use AI. But at some point I think when we want to extract our own knowledge, I'm guiding AI by this is the flow I want you to explain. These are the snippets I want you to relate to.

[00:19:47] Omer Rosenbaum: This is some general structure of the document. Generate what you can, and I'll add my unique knowledge on top of that, and that would be a great document.

[00:19:58] Simon Maple: Yeah. So what are the gaps here still, then? Is there a gap whereby the developer needs to initiate this process?

[00:20:07] Simon Maple: I'm trying to, I'm trying to work out what are the

[00:20:10] Omer Rosenbaum: Of course. This specific process assumes two things. One, you know the code and two, you decided to document it, right?

[00:20:19] Simon Maple: Yeah.

[00:20:19] Omer Rosenbaum: And I guess for some cases, this is the perfect answer. For me, I use this a lot. I implemented something and I want to share the knowledge about it with the rest of the team.

[00:20:29] Omer Rosenbaum: This is much easier for me than anything that is super automatic because I can guide it exactly as I want. If I were new to a codebase, it doesn't really help me as much, and I'll show how we approach that afterwards.

[00:20:40] Simon Maple: So the developer needs to understand, and this is almost like a developer training other developers in the team, or people who are completely outside of the team and unfamiliar with how to do this, how it should be done, but it massively speeds up the process of that and then can actually pinpoint and relate to pieces of code,

[00:20:59] Simon Maple: so that people can click through and actually see in their IDE, what to do, but I think then linking that then back to what we previously discussed as things change, this keeps up to date, which was, which is what makes this relevant, not just the day it was created, but later that afternoon and in the future as well, which is.

[00:21:17] Omer Rosenbaum: Exactly, and then you can also find it in the IDE when you need it.

[00:21:21] Omer Rosenbaum: So it makes sense to write it and also it's much easier because essentially you just selected the right snippets. Set a few instructions, and I have a document.

[00:21:32] Omer Rosenbaum: So this is the approach that we take when you do know the code, and want to write it. What happens when you don't?

[00:21:38] Omer Rosenbaum: Looking at another repository, that I'm gonna bring in now, this is a COBOL repository called Kello, you can find it on GitHub, and, uh, generate documents for it. So a user would click on generate, and then there is some analysis that happens once we perform the static analysis, understand all the relationships in the code base.

[00:21:59] Omer Rosenbaum: And then the first thing we do is suggest a list of modules that we think the documentation should be structured according to. So these are modules that we found in the repository. They could correspond to a single folder, maybe to some folders and files, some logical component.

[00:22:16] Simon Maple: And this is generated by us, or is this suggested by the AI?

[00:22:20] Omer Rosenbaum: Suggested by Swimm.

[00:22:22] Simon Maple: By Swimm, so a mixture of them.

[00:22:24] Omer Rosenbaum: Using our deterministic version. AI helps refine the names here, but finding and separating the different components is done deterministically. And we have a few different heuristics, so if you don't like this, you can re-suggest and get another suggestion.
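To make "done deterministically" concrete, here is a minimal sketch of one possible heuristic: grouping files into candidate modules by their leading folders. The transcript does not describe Swimm's actual heuristics, so `suggest_modules`, the `depth` parameter, and the file names below are all hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def suggest_modules(paths: Iterable[str], depth: int = 1) -> Dict[str, List[str]]:
    """Group repository paths by their first `depth` folders.

    The same input always yields the same grouping, regardless of input order.
    Trying a different `depth` stands in for asking for another suggestion.
    """
    modules: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(paths):
        parts = PurePosixPath(path).parts
        # Files that sit above the requested depth are collected under "(root)".
        key = "/".join(parts[:depth]) if len(parts) > depth else "(root)"
        modules[key].append(path)
    return dict(modules)


# Hypothetical COBOL repository layout.
paths = ["reports/daily.cbl", "reports/monthly.cbl", "transactions/post.cbl", "main.cbl"]
print(suggest_modules(paths))
```

In this picture a language model would only be asked to propose friendlier names for keys such as `reports`, while the grouping itself stays reproducible.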

[00:22:38] Omer Rosenbaum: And then, if you don't know the code at all, you would probably pick one, and say next, and just generate everything for me. But if you do know the code and you click next, we tell you, okay, we think now the VIP section should actually be separated into reports, transactions, management. You can say, no, I, you know what?

[00:22:54] Omer Rosenbaum: I don't want transactions, but maybe I want something else and add to it. So you can change what the structure looks like. Then when you click on generate, we generate, it could be dozens, hundreds or thousands of documents, depending on the size of the repository, starting with, this document here, the overview document, which basically in this case, it took the first suggestion that we saw before.

[00:23:21] Omer Rosenbaum: So this one. And the overview document explains what this repo is about, what the interactions are between the different components, and then for each component, okay, what is it about in general, a paragraph, and a few links to other documents we generated about this, which could be an overview of this logical component, and it could be a full flow.

[00:23:44] Omer Rosenbaum: So this is, the main document. And again, if we're talking about a codebase nobody knows, I think anyone who gets into any task would appreciate a high level overview.

[00:23:53] Simon Maple: Yeah,

[00:23:53] Omer Rosenbaum: Then let's say I know I want to change something specific in a specific place. We can look, so this is COBOL. So for those of you who haven't had the pleasure of programming COBOL or reading COBOL, code is mainly split into programs.

[00:24:08] Omer Rosenbaum: So this is a type of a document we have only for COBOL. It doesn't really make sense to document a Python program, what is a Python program, right? In COBOL it does. So we also document a program and we explain the flow of the program. So this is what the program does visually. And this is the Swimm document.

[00:24:26] Omer Rosenbaum: The diagram also keeps up to date, by the way, because these elements are coupled to code. And then we go step by step and explain what every part does. Again, you click, you get to the relevant part of the code. Let me get some more space here. It remains up to date if you change the code and everything we saw from before, just that this document was generated by AI explaining the process of, say, this program.

[00:24:51] Simon Maple: I guess the risk at this stage is that there's a far greater reliance on the generation here being accurate compared to that previous instance whereby the developer effectively knows that code. In that case, it's going to be very hard and very time consuming for developers to perform that validation effectively.

[00:25:14] Omer Rosenbaum: Especially if they don't know the code.

[00:25:16] Simon Maple: Exactly, yeah. How much of a value is there versus the inefficiency of the documentation potentially being wrong? How much of a balance is there in terms of this?

[00:25:26] Omer Rosenbaum: It's a good question. First of all, we always have, this disclaimer. This is auto generated by Swimm AI.

[00:25:33] Omer Rosenbaum: It hasn't been verified by a human. And it's important to know, right? It could be. We take lots and lots of precautions to write things that we're more confident about, and people who read this know it was generated by AI, know it's supposed to help them. We don't expect any organization to go through all the documents and validate them.

[00:25:53] Omer Rosenbaum: I don't think it makes sense. I think when a developer or product manager, and we'll talk a bit about that in a moment. Say a developer wants to change some part of the code, they find the relevant documentation, when they look through the code as before, they read the documentation, they see that it helps them, and if they want to add or change something, they do it on the fly.

[00:26:11] Omer Rosenbaum: So it's not like a major effort. However, of course, if we, on the extreme, if we generate documentation that is all incorrect, it could be just misleading and harmful even. We put a lot of effort into getting the right context for AI and then validating the output before showing it to users.

[00:26:30] Simon Maple: Yeah.

[00:26:30] Simon Maple: Yeah, so I'm guessing that as a user of something like this, where it's a completely new code base, of course, gives me an amazing first step in, in terms of actually understanding the structure very quickly. I can then go to a relevant part, and then I guess as I look as I want to learn more, let's say I do want to make a change to a legacy piece of code because it's an urgent security or critical bug change that I need to fix.

[00:26:51] Simon Maple: I can at least start with that. And then as I actually start reading through the documentation, the various snippets. I can effectively validate as I read the pieces I want, but it's important, for the developer to recognize I need to validate. It wasn't written with another human involved who knew this code.

[00:27:13] Simon Maple: So the validation almost occurs at two separate points. Previously it was as the documentation was being written. Here, it's probably more when that documentation needs to be used in anger, because there's an issue that they need to fix.

[00:27:28] Omer Rosenbaum: Correct. Yeah, definitely.

[00:27:30] Omer Rosenbaum: And also as you, as they browse the code, they can see those

[00:27:34] Omer Rosenbaum: snippets of information, right, on top of the code that tells something that we're like, okay, so I read the code, I read the code, now this is maybe a bit more complex. Maybe I'll click here and I'll find that the doc might be helpful. So this is indeed the process. Now, I mentioned before product managers, as we saw, especially for large organizations, sometimes, you want to change something in your legacy system because it starts from a business need.

[00:27:59] Omer Rosenbaum: And then some product manager needs to decide what needs to change, and they need to talk with engineering. And pretty often it happens that no one knows where this even exists. So what we did for that is generate different versions of the same document. In this example, this is a document that is mostly made for product managers and business people who want to understand something at a very high level. So it's just described; there are no code snippets here, right? This section is responsible for evaluating and critiquing different fields. For example, it can critique contracts, albums, sequences, and more.

[00:28:32] Omer Rosenbaum: Okay, so if I want to change something here, I can tell the developer, you probably want to look here, because I understand what it does from a business perspective. If you click here, it will take you to the more technical document. It's the same flow, but this time we say, okay, this is the name of this program, and it's called this section, and then it goes to this section. And now we zoom into a part of the flow and explain what each part does, what sections it calls, and so on. So I think when you have the knowledge about the code base, you can then reflect it in different versions, in different language, at different depth levels, for different personas and use cases.
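The idea of reflecting one body of knowledge at different depths for different personas can be sketched roughly as follows. This is a minimal illustration, not Swimm's implementation: the `SectionKnowledge` record, the `render` function, the persona names, and the section name `CRITIQUE-FIELDS` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SectionKnowledge:
    """Hypothetical record of what is known about one section of a program."""
    name: str
    business_summary: str
    calls: list = field(default_factory=list)
    snippet: str = ""


def render(section, persona):
    """Render the same knowledge record as text for a given persona."""
    lines = [section.business_summary]
    if persona == "developer":
        # Only the technical view names the section, its callees, and its code.
        lines.insert(0, f"Section: {section.name}")
        if section.calls:
            lines.append("Calls: " + ", ".join(section.calls))
        if section.snippet:
            lines.append("Code: " + section.snippet)
    return "\n".join(lines)


# Hypothetical example data; the names are invented for illustration.
critique = SectionKnowledge(
    name="CRITIQUE-FIELDS",
    business_summary="Evaluates and critiques different fields.",
    calls=["CHECK-FIELD"],
    snippet="PERFORM CHECK-FIELD",
)

pm_view = render(critique, "product_manager")
dev_view = render(critique, "developer")
print(pm_view)
print(dev_view)
```

Both views come from the same record, so the high-level document and the technical one describe the same flow and differ only in how much detail they expose.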

[00:29:10] Simon Maple: And it's interesting, actually, because this leads us back to that original conversation, which I think we had in part one of this episode, about the importance of mixing AI with a static analysis piece, to not just describe but truly understand the flows. And when I look at these documents, I see various kinds of zoomed-out versions of an AST-style graph, whereby you effectively map out the flows between methods, calls, objects, and whatever it is, and box them up.

[00:29:46] Simon Maple: And then all of a sudden you get the bigger workflows, the bigger modules, the programs and so forth, all the way through to the business side. So this isn't ChatGPT just going, this looks like that, so let's put it in a box called this. It's a true static analysis at its core, which is creating that AST model, I'm guessing, and that's how these graphs are built out.

[00:30:09] Omer Rosenbaum: Yes. And then we build the knowledge bottom up. So in terms of a language such as COBOL, where you might have variable names that don't mean much, we look at all the usages, use static analysis, and try to understand what this does. Here we might rely on an LLM to give a textual description.

[00:30:24] Omer Rosenbaum: Then we rely on that when we look at the section that uses this variable. Now, we talked about the process. We started with a much more LLM-based approach, or an agentic approach: okay, this is the code, tell us what you need and understand. And it just went. We tried for a long time, with many iterations, and it just found flows that started in one place and then jumped to another, not understanding that it's a completely different flow.

[00:30:47] Omer Rosenbaum: And it's a really hard task to really understand all the code flows and then to explain or rank them, to say which ones are the most important to document here. To generate the overview document, you need to understand what's not important to include here, which is a very hard task, right?

[00:31:07] Omer Rosenbaum: So you need to map out everything, then summarize to pieces, then decide which ones are important here. And there is some hierarchy even, like transactions between branch operations. So to generate that, you really need this bottom up approach. And I think the combination of static analysis with AI to generate a really nice looking document in the end, in coherent English, makes a lot of sense.
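The bottom-up approach described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not Swimm's actual pipeline: the call graph, the section names, and the `describe` function (a stand-in for an LLM call) are all hypothetical. The only real API used is `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical call graph, as a static analyzer might produce it:
# each section maps to the set of sections it calls.
CALL_GRAPH = {
    "MAIN": {"VALIDATE-INPUT", "POST-TRANSACTION"},
    "VALIDATE-INPUT": {"CHECK-FIELD"},
    "POST-TRANSACTION": {"CHECK-FIELD"},
    "CHECK-FIELD": set(),
}


def describe(section, callee_summaries):
    """Stand-in for an LLM call: builds a description from callee summaries."""
    if not callee_summaries:
        return f"{section} (leaf)"
    return f"{section} -> [" + "; ".join(callee_summaries) + "]"


def summarize_bottom_up(call_graph):
    """Summarize callees before callers, so each summary can reuse earlier ones."""
    summaries = {}
    # static_order() yields a section only after all the sections it calls.
    for section in TopologicalSorter(call_graph).static_order():
        callees = sorted(call_graph.get(section, ()))
        summaries[section] = describe(section, [summaries[c] for c in callees])
    return summaries


summaries = summarize_bottom_up(CALL_GRAPH)
print(summaries["MAIN"])
```

The ordering comes from static analysis of the graph, and the text generation is confined to `describe`, which only ever sees one section plus the summaries already built beneath it. Recursive code would need extra handling, since `TopologicalSorter` raises `CycleError` on cycles.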

[00:31:31] Simon Maple: Amazing. This is super interesting technology, and I'm sure listeners and watchers of the video can see that already and are probably already thinking about how they can apply this to their existing code bases. If people wanted to try Swimm out, what's the best way for folks to get started?

[00:31:51] Omer Rosenbaum: So we have a free tier, but I encourage everyone listening to this to reach out to us. We're happy to walk you through it and understand your specific use cases. If you're working in an enterprise, we'd really like to understand your specific use case and make sure we generate documents that adhere to your conventions and to what you find most valuable, right?

[00:32:14] Omer Rosenbaum: Because you can generate lots of different docs based on the same knowledge of the code. So we really appreciate having these conversations and generating documents that are customized to your repositories or your organization's needs.

[00:32:27] Simon Maple: Awesome. And the website is swimm.io, with the double M, right?

[00:32:32] Omer Rosenbaum: Correct. S W I M M . I O.

[00:32:32] Simon Maple: Perfect. Amazing. This has been very insightful, and very interesting to see it in action. It's great to see the workflows and how valuable they will be sitting in existing developer workflows, certainly. Thanks very much, Omer, for sharing your knowledge with us here.

[00:32:50] Simon Maple: And yeah, thanks everyone for listening to another episode and tuning in. See you next time.

[00:32:57] Omer Rosenbaum: Thank you, Simon.

Podcast theme music by Transistor.fm.