When using AI coding assistants to refactor symbols across large codebases (6k+ files), developers face a binary choice: precision (LSP-based tools) or efficiency (grep/ripgrep). Shebe attempts to address this trade-off by way of a good old BM25 index, which is surprisingly fast and efficient.
shebe asks the simple question: "where does this symbol appear as text?". For C++ codebases that heavily use templates and macros, shebe will struggle. But I'm curious how it would actually perform, so I'm currently performing a search on https://gitlab.com/libeigen/eigen. Will report the results shortly.
When using AI coding assistants to refactor symbols across large codebases (6k+ files), developers face a binary choice: precision (LSP-based tools) or efficiency (grep/ripgrep). Shebe attempts to address this trade-off by way of a good old BM25 index, which is surprisingly fast and efficient.
How well does this approach work with C++ source code - which is notoriously difficult to parse, given context-dependent semantics?
Turned this into a science experiment and designed a test/workflow to rename the symbol MatrixXd -> MatrixPd in eigen and the results are promising at first glance. See https://github.com/rhobimd-oss/shebe/blob/main/WHY_SHEBE.md#...
shebe asks the simple question: "where does this symbol appear as text?". For C++ codebases that heavily use templates and macros, shebe will struggle. But I'm curious how it would actually perform, so I'm currently performing a search on https://gitlab.com/libeigen/eigen. Will report the results shortly.