Comments
-
anonymous
Actually, it looks like I found algorithm with complexity O(s^2*p), where s and p are lengths of string and pattern respectively. Is it OK if I'll implement it?
-
Greg Hurrell
Early versions of Command-T had a linear matching/scoring algorithm which was much faster.
The current matching algorithm provides much more intuitive ordering of the search results, but it approaches quadratic runtime in the worst case.
As it is written in C, even this quadratic runtime is fast enough for small-ish projects (of the order of 10k or 20k files). I'd love to find a faster algorithm that still provides scoring as good as or close to the quadratic one, or failing that, have some kind of self-tuning behavior which switches to a cheaper strategy as the search space crosses some threshold.
-
dzhioev
The problem is that current algorithm is not quadratic at all. You can ensure on my example: 1) Create empty dir. 2) Create file "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" in it. 3) Run vim in this dir and start Command-T. 4) Type several "a"-s in search string. With every 'a' search will greatly slow down. In fact you can't wait till it ends after ~10 'a'-s. In this case complexity of algorithm will be O(s!/(p! * (s - p)!)) (it's combinations number http://en.wikipedia.org/wiki/Combination ). It happens because algorithm memorize nothing during it's work.
-
dzhioev
BTW, Is there any way to subscribe for updates to this thread?
-
Greg Hurrell
I've got a patch which adds memoization; will push once I've done some more testing. It leads to a noticeable improvement using the "pathological" test case that you describe above.
Thanks for bringing this issue up. It had totally escaped me that the way the recursion was implemented was leading to unnecessary re-work in these cases.
(To answer you question about subscribing, there is an Atom feed; no email notifications yet, though.)
-
Greg Hurrell
I'm going to close this ticket now as the memoization that's been merged seems to be working fairly well (since then, have also added threading to map-reduce the problem). I did some testing to see how close the recursive algorithm is to a non-recursive algorithm, or one in which the recursion depth is limited, and there's not much of a delta any more (see ticket #2130).
Thanks for bringing the issue to my attention. Things are much better now. Still interested in making them even better, though, but will have to get creative for that.
-
Greg Hurrell
Status changed:
- From: new
- To: closed
-
Greg Hurrell
See also feature request #2113, which is a placeholder for the possibility of memoizing more things.
Add a comment
Comments are now closed for this issue.